Eduo, your post is very interesting.
I don't know what priority the Plex dev team gives this update request. Eduo, do you want to help me prototype a better agent? I have some knowledge of how metadata works and you have the OS expertise.
PS: soleol is my favorite app to fetch subs when Plex doesn't succeed. Thank you for your work.
I've coupled this post with a stern one in the developers forum at OpenSubtitles. I think that unless OpenSubtitles cleans this up, the whole point of using it instead of other sources becomes moot.
Sure, I can help prototype a new agent, but who would code it? The current one could be forked to be improved, but I don't do Python.
I'm only familiar with opensubs via XBMC...
And it would seem to me the solution lies with opensubs, not the agents we are using to find subtitles.
You may not be familiar with the way OpenSubtitles works and why it's usually chosen first when adding sources to Subtitles. This is OK, since it's not common knowledge or obvious unless you go into it. Oncleben13 mentioned it, but he assumed the people reading would already know.
Even if you got info from IMDB and TMDB it still wouldn't help, since the SRT file needs to be for the particular encoded source you have, and there can be millions of rips, all with different sync.
This is where the confusion exists. When searching in OpenSubtitles you don't search by name, but by "hash". Programs create a "hash" (think of it as a "digital fingerprint" of the file) and ask OpenSubtitles if it has subtitles matched to that hash and byte size. This means that if there are a dozen subtitles for the same movie, the results will only include those that match the specific file you have.
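For anyone who wants to see what that "fingerprint" looks like, here's a minimal sketch in Python. It follows the hash scheme as I understand it from OpenSubtitles' public documentation (file size plus the 64-bit sums of the first and last 64 KB), so treat the details as an assumption rather than the reference implementation. The function name `fingerprint` is my own.

```python
import os
import struct

CHUNK = 64 * 1024  # 64 KiB is read from each end of the file


def fingerprint(path):
    """Return (hash, byte_size) for a video file.

    Assumed scheme: start from the file size, add every little-endian
    64-bit word of the first and last 64 KiB, and wrap at 64 bits.
    """
    size = os.path.getsize(path)
    if size < 2 * CHUNK:
        raise ValueError("file too small to fingerprint")

    total = size
    with open(path, "rb") as f:
        for offset in (0, size - CHUNK):
            f.seek(offset)
            data = f.read(CHUNK)
            for (word,) in struct.iter_unpack("<Q", data):
                total = (total + word) & 0xFFFFFFFFFFFFFFFF

    return "%016x" % total, size
```

Note that the file name never enters the calculation, only the contents and the size. That's why renaming a file keeps its hash and re-encoding it doesn't.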
This is a great advantage when trying to get the right one for you. The problem comes when your hash is not matched to any video or is matched to the wrong videos. In the first case the plug-in falls back to name-based search (there's little more it can do).
This means that if you convert or modify your videos in any way other than renaming them before getting subtitles, you lower the probability of getting subs for them. It also means that when it does find matches they're much better than your average search and are usually in sync with your video.
Renaming them lowers the probability of finding stuff by name, since OpenSubtitles keeps track of release names as well and results can be prioritized by those too.
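The hash-first, name-second lookup described above can be sketched like this. It's only an illustration: `search_by_hash` and `search_by_name` are hypothetical callables standing in for whatever the agent uses to query OpenSubtitles, not real API functions.

```python
def find_subtitles(movie_hash, byte_size, release_name,
                   search_by_hash, search_by_name):
    """Try the exact file match first, then fall back to the name.

    search_by_hash and search_by_name are hypothetical lookups that
    each return a list of subtitle results (possibly empty).
    Returns (method_used, results).
    """
    results = search_by_hash(movie_hash, byte_size)
    if results:
        # Matched to this exact file, so these are usually in sync.
        return "hash", results

    # No hash match: the release name is all that is left to go on.
    return "name", search_by_name(release_name)
```

The fallback only covers the "no match" case. When the hash is matched to the wrong videos, the first branch returns bad results and nothing in this flow can tell.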
The problem is that OpenSubtitles is a very big database, errors and mismatches have crept into it, and it's very hard to clean them up. Imagine you're a pretty careless user and you have all your movies and your subtitles in a single big folder. You might not care much, since players do a good job of picking the ones named the same as your file.
But not all do, and some take ALL subtitles in and present them to the user prioritized by name, the best match first. This means that, without you noticing, your player may be picking up all the other subtitles as valid.
Then you get your entrepreneur developer type (the one who would code something like "take all subtitles in the folder and subfolders regardless of name") who adds automatic upload of hash matches to OpenSubtitles (the intention is good, the results disastrous). You can see where this leads: a single movie hash gets matched to a ton of subtitles (this needs to happen only once). Then OpenSubtitles, which already knows those subtitles belong to other movies, decides that they're all the same movie. In a couple of days you've got such a mess that it can take years to get back to a clean state.
To your point: OpenSubtitles reports back the IMDB ID of the files it finds, while Plex has the IMDB ID of series and movies stored (from TMDB and TVDB). You can prioritize the results (which are already filtered by matched file hash) according to IMDB ID.
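Something like the following would do that prioritization. It's a rough sketch that assumes each result is a dict carrying an `imdb_id` field, which is a made-up field name for illustration, and that IDs may come with or without the "tt" prefix and leading zeros.

```python
def normalize_imdb_id(value):
    """Reduce 'tt0133093', '0133093' or 133093 to the string '133093'."""
    if value is None:
        return None
    digits = str(value).lower().lstrip("t").lstrip("0")
    return digits or None


def prioritize_by_imdb(results, plex_imdb_id):
    """Put results whose IMDB ID matches the one Plex stored first.

    results: list of dicts with a hypothetical 'imdb_id' key.
    The sort is stable, so the original order is kept within the
    matching and the non-matching groups.
    """
    wanted = normalize_imdb_id(plex_imdb_id)

    def rank(result):
        found = normalize_imdb_id(result.get("imdb_id"))
        return 0 if wanted is not None and found == wanted else 1

    return sorted(results, key=rank)
```

Sorting rather than filtering is deliberate: if the IMDB ID on the OpenSubtitles side is itself wrong, the mismatched results are still there at the bottom of the list instead of disappearing.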
I long for the day when SRT files are scrapped entirely in favor of subtitle tracks incorporated into the source.
They are just too hit and miss for my liking.
You might not have thought this through, since it's almost impossible to do except for very localized and specific situations.
Most subtitles out there are made by volunteers. The closer they are to the release date, the less likely it is that they come from an "official" source (this is especially true for TV shows). Since there are so many groups doing subtitles, there are always multiple versions of them, sometimes of equal quality.
Subtitles also evolve over time. If you were to get subtitles for a series released on DVD long ago they'd probably be of great quality, but last week's episode of Walking Dead most likely had lower-quality subs for a whole week.
Add to this that in some markets there are never "official" subtitles, so all existing ones are translations. And a lot of translations are well-meaning but useless Google Translate ones that just muddy up the pool.
Essentially, the further you get from the English language and the US release dates the worse the subtitles tend to be.
Your "ideal" would only be met the day releases come with subtitles from the source for all languages (this is never going to happen, as volunteer subtitles are the only way to get some of the media into markets it'll never reach officially).
If you're just talking about English subs and only for month-old-at-the-earliest videos, then it's perfectly feasible.
I've been pushing for a "master" subtitle selection for each video in OpenSubtitles for a while, but this has proven impossible to do. For them this would mean delivering a single suggested result (actually two: one for full CC, one for just spoken text), where one subtitle for each release is flagged as "the best" (until a better one comes along) and that one is prioritized in clients.
One note: you might see what the problem with OpenSubtitles is. It requires people to upload matches between a subtitle and a video. This is done by only a few volunteers (I calculate the proportion to be around 10K to 1, downloaders vs. uploaders), since most players either do not upload matches (like Plex) or do them wrong (like BSPlayer).