I have 20 movies that keep getting auto-matched to the wrong movie, which produces duplicates. I’m using the “Plex Movie Scanner” scanner with “The Movie Database” agent.
I did the Plex Dance, but the scan re-creates the duplicates every time. I then split them apart and unmatched the bad movie. The problem is that the “Originally Available” date is set to some random date. The Plex Movie Scanner uses the year from this date to search the TheMovieDB.org API, which is why the movie matches to the wrong one. There are no media tags in my files for Plex to read. Where is it grabbing this random date from?
Example…
The 21 & Over (2013).mp4 movie gets matched to 21 Jump Street (2012).
If I try to manual search I get the same incorrect result. Plex searches TheMovieDB.org with the title of “21 & Over (2013)” and year of “2017” instead of “21 & Over” with a year of “2013”.
The “Originally Available” date for the movie is set by Plex to “2017-05-02” before the matching and is apparently where Plex gets the year for the search.
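For reference, a clean split of the file name into the title and year that should be sent to the search could look roughly like this. It is a minimal sketch, not Plex's scanner code; the pattern and the function name `split_title_year` are assumptions made for illustration, and it only handles names of the form “Title (Year).ext”.

```python
import re
from pathlib import PurePosixPath

# Hypothetical pattern: "<title> (<4-digit year>)" at the end of the file stem.
NAME_RE = re.compile(r"^(?P<title>.+?)\s*\((?P<year>\d{4})\)$")


def split_title_year(path):
    """Return (title, year) parsed from a 'Title (Year).ext' file name."""
    stem = PurePosixPath(path).stem  # drops the directory and the extension
    match = NAME_RE.match(stem)
    if match is None:
        return stem, None  # no trailing "(YYYY)" found
    return match.group("title"), int(match.group("year"))


print(split_title_year("/shares/MP4Movies/MP4 Movies 01/21 & Over (2013).mp4"))
# ('21 & Over', 2013)
```

With this split, the search would use “21 & Over” and 2013 rather than the full “21 & Over (2013)” string and an unrelated year.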
I’m aware that I can manually fix these. It’s just that I have to track this and redo every time I set up Plex.
Many of the other duplicates are sequels that were released one year apart. Plex will always pick the most popular movie (which is usually the prequel) over an exact match. At the very least, the match should fail if there are two almost identical movies in the search results, as there are in these cases.
I wrote my own scanner against The Movie Database API and got 100% matches on a collection of over 4000 movies. I think the difference is that I give priority to an exact match and Plex relies too heavily on popularity. I get using popularity, but Plex needs to give priority to exact matches.
Examples…
/shares/MP4Movies/MP4 Movies 01/The Hills Have Eyes (2006).mp4
/shares/MP4Movies/MP4 Movies 01/The Hills Have Eyes 2 (2007).mp4
/shares/MP4Movies/MP4 Movies 01/The Hunger Games- Mockingjay - Part 1 (2014).mp4
/shares/MP4Movies/MP4 Movies 01/The Hunger Games- Mockingjay - Part 2 (2015).mp4
/shares/MP4Movies/MP4 Movies 01/Kill Bill- Vol. 1 (2003).mp4
/shares/MP4Movies/MP4 Movies 01/Kill Bill- Vol. 2 (2004).mp4
/shares/MP4Movies/MP4 Movies 01/Paranormal Activity (2009).mp4
/shares/MP4Movies/MP4 Movies 01/Paranormal Activity 2 (2010).mp4
/shares/MP4Movies/MP4 Movies 01/Saw IV (2007).mp4
/shares/MP4Movies/MP4 Movies 01/Saw V (2008).mp4
/shares/MP4Movies/MP4 Movies 01/Scary Movie (2000).mp4
/shares/MP4Movies/MP4 Movies 01/Scary Movie 2 (2001).mp4
/shares/MP4Movies/MP4 Movies 01/Scream (1996).mp4
/shares/MP4Movies/MP4 Movies 01/Scream 2 (1997).mp4
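The exact-match-first idea described above could be sketched roughly as follows. This is an illustration only, not the scanner mentioned above and not Plex's code: the function `pick_match`, the candidate dictionaries, and the popularity numbers are all made up for the example.

```python
def pick_match(title, year, candidates):
    """Prefer an exact title+year match; use popularity only as a fallback."""

    def norm(text):
        # Case-insensitive comparison with collapsed whitespace.
        return " ".join(text.casefold().split())

    exact = [c for c in candidates
             if norm(c["title"]) == norm(title) and c["year"] == year]
    if len(exact) == 1:
        return exact[0]
    if len(exact) > 1:
        return None  # ambiguous: better to fail than to guess
    if not candidates:
        return None
    # No exact match: fall back to the most popular candidate.
    return max(candidates, key=lambda c: c["popularity"])


# Illustrative search results; the popularity values are invented.
results = [
    {"title": "Scream", "year": 1996, "popularity": 90.0},
    {"title": "Scream 2", "year": 1997, "popularity": 40.0},
]
print(pick_match("Scream 2", 1997, results)["title"])  # Scream 2
```

Ranking by popularity alone would return “Scream” here; checking for an exact title and year first keeps the sequel from being merged into the original.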
I’m figuring that pointing out flaws in the Plex Media Scanner isn’t going to get me anywhere. I am curious about the “Originally Available” date bug though. Instead, I just created code that compares the TheMovieDB ID from Plex to my collection program. At least I can easily identify all the mistakes and verify when Plex is accurate.
Here’s the SQL statement to pull the data from Plex…
SELECT
  metadata_items.title AS 'Title',
  metadata_items.year AS 'Year',
  media_parts.file AS 'Path',
  replace(substr(metadata_items.guid, 33, 1000), '?lang=en', '') AS 'TheMovieDB ID'
FROM media_items
INNER JOIN media_parts ON media_parts.media_item_id = media_items.id
INNER JOIN metadata_items ON metadata_items.id = media_items.metadata_item_id
INNER JOIN library_sections ON library_sections.id = media_items.library_section_id
WHERE library_sections.name = 'Movies'
ORDER BY metadata_items.title, metadata_items.year, media_parts.file
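The comparison step could be sketched like this. It is a minimal sketch under stated assumptions: the guid is assumed to look like `com.plexapp.agents.themoviedb://<id>?lang=en` (inferred from the offset of 33 and the `?lang=en` suffix in the query above), the function names are hypothetical, and the IDs in the example are placeholders, not real TheMovieDB IDs.

```python
# Assumed guid prefix; it is 32 characters long, which is why the SQL
# above starts the substring at position 33.
PREFIX = "com.plexapp.agents.themoviedb://"


def tmdb_id_from_guid(guid):
    """Extract the ID part of a guid, or None if the prefix does not match."""
    if not guid.startswith(PREFIX):
        return None
    return guid[len(PREFIX):].split("?", 1)[0]  # drop "?lang=en" and similar


def find_mismatches(plex_rows, expected_ids):
    """plex_rows: iterable of (path, guid); expected_ids: {path: id}."""
    mismatches = []
    for path, guid in plex_rows:
        actual = tmdb_id_from_guid(guid)
        expected = expected_ids.get(path)
        if expected is not None and actual != expected:
            mismatches.append((path, expected, actual))
    return mismatches


# Placeholder IDs for illustration only.
rows = [("/movies/A (2000).mp4", PREFIX + "111?lang=en"),
        ("/movies/B (2001).mp4", PREFIX + "111?lang=en")]
expected = {"/movies/A (2000).mp4": "111", "/movies/B (2001).mp4": "222"}
print(find_mismatches(rows, expected))
# [('/movies/B (2001).mp4', '222', '111')]
```

Two paths that resolve to the same ID, as in the example rows, are exactly the merged duplicates described earlier.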
For “21 & Over,” what is the full path to the file and to where in that path does your library point? Using your second post as an example, is the path:
/shares/MP4Movies/MP4 Movies 01/21 & Over (2013).mp4
And does your library point to “MP4Movies,” or “MP4 Movies 01?” It’s generally recommended to place movies in a sub-directory named like the movie itself, particularly when using nested paths. As an example, for “The Hills Have Eyes” and “The Hills Have Eyes 2,” if your library points to “MP4Movies,” you’d likely want to have them in their own sub-directories:
/shares/MP4Movies/MP4 Movies 01/The Hills Have Eyes (2006)/The Hills Have Eyes (2006).mp4
/shares/MP4Movies/MP4 Movies 01/The Hills Have Eyes 2 (2007)/The Hills Have Eyes 2 (2007).mp4
However, if your library points to “MP4 Movies 01,” this likely wouldn’t be necessary.
Finally, have you checked the tags in the files’ MP4 headers to ensure the data there makes sense? Plex, by default, prioritizes this over file names; so, bad data in the header can cause issues. This potential issue can be mitigated by demoting “Local Media Assets” in the scanner priority list for the agents in Settings -> Agents -> Movies -> Plex Movie/The Movie Database.
21 & Over also matched to 21 Jump Street for me.
/Movies/21 & Over (2013)/21 & Over (2013).mkv
Try “21 and Over” instead. It matched correctly on my system.
/Movies/21 and Over (2013)/21 and Over (2013).mkv
“21 and Over” is listed in the Also Known As section on imdb.com. Sometimes using one of the alternate names helps when Plex has trouble matching a movie.
Interesting. “21 & Over (2013).mkv” matched correctly on my test system, using Plex Movie as the agent. I switched the test library over to The Movie Database and it detected as “21 Jump Street.”
I also tested with “The Hills Have Eyes” (1 & 2) and they were detected separately by Plex Movie. However, The Movie Database agent merged them.
And using the sub-folder approach I suggested earlier had no impact on this.
Yes, the full path for 21 & Over is /shares/MP4Movies/MP4 Movies 01/21 & Over (2013).mp4.
My library points to /shares/MP4Movies/MP4 Movies 01/.
All my media files have no tags. Also, I had disabled “Local Media Assets” from the “The Movie Database” agent before adding the library. I always do this for the same reason you just mentioned.
Thanks for the suggestions, but this particular bug is something else.
Thanks for testing. I did the same test earlier. The Plex Movie agent uses IMDb and it doesn’t mess up with “21 & Over (2013)”.
I also tested TheMovieDB.org manually (21 & Over y:2013) and using the API (https://api.themoviedb.org/3/search/movie?query=21%20%26%20Over&year=2013). Both times, The Movie Database correctly matches to 21 & Over (2013). The Plex Movie Scanner is not sending the title and year to TheMovieDB.org API cleanly. Even when I use the Plex matching feature and type in the correct title and year, it still messes up.
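For anyone who wants to reproduce the API test, a request URL like the one above can be built as follows. This is a small sketch using only the Python standard library; the function name `search_url` is hypothetical, and authentication (an API key) is left out, as it is in the URL above.

```python
from urllib.parse import quote, urlencode

BASE = "https://api.themoviedb.org/3/search/movie"


def search_url(title, year):
    """Build a search URL with the title and year as separate parameters."""
    params = {"query": title, "year": year}
    # quote_via=quote encodes spaces as %20 and the ampersand as %26,
    # so "&" in the title cannot be mistaken for a parameter separator.
    return BASE + "?" + urlencode(params, quote_via=quote)


print(search_url("21 & Over", 2013))
# https://api.themoviedb.org/3/search/movie?query=21%20%26%20Over&year=2013
```

The point of the example is that the title and the year travel as separate, properly encoded parameters, which is what the manual test above relies on.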
Thanks for the suggestion. First, I don’t use the Plex Agent (which uses IMDb). I use The Movie Database agent. I chose The Movie Database over IMDb because it has a documented API.
Second, I wrote code that names my files using TheMovieDB.org. I have it auto-replace NTFS invalid characters (e.g. /, \, :, etc.), but otherwise they’re identical.
Your suggestion of renaming the files would work in this case, but I don’t want to get in the habit of naming my files to overcome Plex Movie Scanner issues. Also, it wouldn’t solve several other naming issues, like sequels (e.g. Scream (1996) and Scream 2 (1997)).
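A replacement step of that kind might look like the sketch below. It is an assumption-laden illustration, not the code referred to above: the character set is the standard list of characters Windows does not allow in file names, and the hyphen as the replacement is inferred from file names in this thread such as “Kill Bill- Vol. 1 (2003).mp4”.

```python
# Characters that are not allowed in Windows/NTFS file names.
INVALID = '\\/:*?"<>|'

# Assumed replacement: a hyphen, as seen in "Kill Bill- Vol. 1 (2003).mp4".
_TABLE = str.maketrans({ch: "-" for ch in INVALID})


def sanitize(name):
    """Replace every invalid character in a file name with a hyphen."""
    return name.translate(_TABLE)


print(sanitize("Kill Bill: Vol. 1 (2003)"))
# Kill Bill- Vol. 1 (2003)
```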
If you’re interested, Plex is previewing their new movie agent here:
All indications are that it’s extremely fast when compared to these legacy agents. They’re also actively looking for matching issues and correcting them.
That doesn’t help with this issue, obviously, but it’s interesting.
I read that article about the new Plex Movie agent earlier tonight when Plex put a link to it at the top of my post. However, it’s for the Plex Movie agent (not The Movie Database agent). I don’t plan on moving to this agent because it uses IMDb. I like using TheTVDB and TheMovieDB because they have documented APIs and can be used outside of Plex without writing a web scraper. I still might try it later just to see how accurate it is. Thanks.
You’re the third person to suggest I use the new Plex Movie agent. I’m going to have to give it a try. Another reason I prefer TheMovieDB over IMDb is that there are dozens of movies where the years differ between the two. For all of the ones that I spot-checked, TheMovieDB was accurate. Many times it’s because IMDb is displaying the USA release date, but several other times it’s just wrong.