Bug in Plex Movie Agent: Overwrites metadata with TMDB data given urlopen read timeouts

jonkv · March 26, 2017, 7:28pm

I’m trying to move my movie library over to IMDB ratings, so I set the ratings source to IMDB and refreshed metadata for that entire library. Quite a few movies still used TMDB ratings.

I refreshed everything again. Imagine my surprise when some movies that used to have IMDB ratings now have TMDB ratings instead!

Reading the log files and going through the PlexMovie.bundle code reveals the following:

The agent always begins by reading TMDB data, then overwrites that data with IMDB data for those fields that were not explicitly set to use TMDB as a source.
Sometimes the agent succeeds in reading from TMDB but fails to read anything from IMDB, resulting in the following messages in com.plexapp.agents.imdb.log:

Plex_Movie_FAILED - Error obtaining Plex movie data for [index]: <urlopen error ('The read operation timed out',)>

This means that the TMDB values are still there, and are never “overwritten” by IMDB values in the new metadata object being constructed in memory. The problem is that the agent doesn’t terminate the update. It continues, which produces follow-up exceptions and log messages: the ‘movie’ variable is still None, so the agent can’t extract the expected information from it:

Error obtaining extras for 0468569: 'NoneType' object has no attribute 'xpath'
Error obtaining Rotten tomato data for 0468569: 'NoneType' object has no attribute 'xpath'

Then it saves the TMDB-based data, causing the old IMDB-based information that already existed in my library to be overwritten:

Serializing to /var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Metadata/Movies/1/5c3ebc654f842202387b9b25177ff74255acf8d.bundle/Contents/com.plexapp.agents.imdb/Info.xml

Since I’m likely to get at least some IMDB timeouts every time I refresh the metadata for an entire library, it seems this is never going to work unless I manually refresh exactly those entries that failed. If I schedule metadata to be refreshed periodically I’m also likely to get entries that oscillate between IMDB and TMDB.

Unless the cache takes care of this to some extent, so that we eventually reach a fixpoint? In any case, this seems like a clear bug. The agent already raises a RuntimeWarning if there is a response that doesn’t contain a title; it just ignores the case where there is no response at all.

rossinior · March 26, 2017, 7:34pm

I have the same symptoms with the Plex Movie agent using OFDB ratings.
Therefore, +1 for “memorizing the source of any info”, which would then get weighed during the refresh process.

jonkv · March 27, 2017, 5:35pm

Also, the default timeout of 20 seconds does not seem to be enough for many of the ‘https://meta.plex.tv/m/[id]?lang=en&ratings=1&reviews=1&extras=1’ pages. I repeatedly get timeouts where PlexMovie tries to retrieve those pages and then I naturally won’t get any IMDB information. Loading the same pages in a browser sometimes works, eventually.

jonkv · February 17, 2018, 7:47am

This is still happening.

Additional technical information: The code in question is in PlexMovie.bundle/Contents/Code/__init__.py, in update(self, metadata, media, lang). One of the first things it does is read the TMDb metadata:

get_tmdb_metadata(guid, lang, metadata)

Around 15 lines later, it tries to read more information from IMDB. The intention is for this to supplement the TMDb metadata, but also to override it in case the agent is configured to (for example) use ratings from IMDB instead of TMDb.

    url = FREEBASE_URL % (guid, lang)
    movie = None

    try:
      movie = XML.ElementFromURL(url, cacheTime=CACHE_1WEEK)

      # ... around 130 lines of code ...

    except Exception, e:
      Log('Plex_Movie_FAILED - Error obtaining Plex movie data for %s: %s' % (guid, str(e)))

Note that FREEBASE_URL is ‘https://meta.plex.tv/m/%s?lang=%s&ratings=1&reviews=1&extras=1’. In other words, this is trying to retrieve information from a Plex server.

Sometimes this server is slow and the call to XML.ElementFromURL times out. This causes XML.ElementFromURL to raise an exception. As shown above, this exception is caught and logged. But after logging the exception, the agent simply continues. This means that:

Plex may already have had IMDB ratings for this particular movie in its metadata, because another time (maybe weeks ago), the metadata refresh for this movie succeeded.
During this refresh, Plex found data from TMDb, including ratings.
It then tried to get data from IMDB, but failed. Had it succeeded, it would have overwritten the ratings in the metadata object being created with IMDB ratings. But now the new metadata object retains the TMDb ratings that were loaded at the beginning of the call to update().
Apart from the log message, the agent ignores this IMDB failure. It catches the timeout exception and continues as if nothing has happened.
When the call to update() ends, it returns all the metadata it found. This includes TMDb ratings, which we don’t want.
To the caller, it appears as if update() was completely successful. Therefore the central parts of Plex overwrite the old metadata (which had IMDB ratings) with new incomplete metadata (with TMDb ratings).
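
The sequence above can be reproduced outside Plex with a small standalone simulation. This is only a sketch of the control flow, not the agent's code: fetch_tmdb, fetch_plex_movie and the dictionary-based library are hypothetical stand-ins, and the rating values are made up.

```python
# Standalone simulation of the flow described above; not the real agent code.
# fetch_tmdb, fetch_plex_movie and the "library" dict are hypothetical stand-ins.

def fetch_tmdb(guid):
    # Pretend TMDb always answers. The rating value is made up.
    return {"rating": 7.0, "rating_source": "tmdb"}

def fetch_plex_movie(guid, fail):
    # Stand-in for XML.ElementFromURL(url, ...) against meta.plex.tv.
    if fail:
        raise IOError("The read operation timed out")
    return {"rating": 9.0, "rating_source": "imdb"}  # made-up value

def update(guid, fail):
    metadata = fetch_tmdb(guid)  # TMDb data goes in first
    try:
        # On success, the preferred source overrides the TMDb fields.
        metadata.update(fetch_plex_movie(guid, fail))
    except Exception as e:
        # The failure is logged, and then the update simply carries on.
        print("Plex_Movie_FAILED - Error obtaining Plex movie data for %s: %s" % (guid, e))
    return metadata  # returned as if the update had been complete

library = {}
library["0468569"] = update("0468569", fail=False)  # earlier, successful refresh
library["0468569"] = update("0468569", fail=True)   # later refresh hits a timeout
print(library["0468569"]["rating_source"])
```

After the second call, the stored entry carries the TMDb rating again, even though the first refresh had stored the IMDB one.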

During those periods when meta.plex.tv is slow, this can happen for many movies. As a user I don’t necessarily go to every single affected movie and update each one individually. Instead I simply see that “something seems to be wrong here”, and I use “refresh all metadata” for the entire library. Some movies are “fixed” by this, while others that already had correct metadata are affected by timeouts during the new refresh. The result is that a new subset of movies lacks IMDB ratings. I try again, and yet another subset is affected by timeouts.

Note that this only occurs during times when meta.plex.tv is slow. To reproduce, you must either wait for such a period, intentionally make meta.plex.tv slow, or simulate timeouts, either by low-level networking tools or by randomly causing a failure in the Plex Movie bundle. To make the failure appear every time, you should be able to replace:

    try:
      movie = XML.ElementFromURL(url, cacheTime=CACHE_1WEEK)

with:

    try:
      raise RuntimeWarning("Simulating timeout")
      movie = XML.ElementFromURL(url, cacheTime=CACHE_1WEEK)

This raises an exception in exactly the same place where XML.ElementFromURL would raise an exception when there is a timeout.
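
For the "randomly causing a failure" variant, a helper along these lines could be called as the first statement inside the try block instead of the unconditional raise. This is an untested sketch: maybe_fail and FAILURE_RATE are names made up for illustration, and it assumes the agent's sandbox allows importing the standard random module.

```python
import random

FAILURE_RATE = 0.3  # hypothetical: fraction of calls that should fail


def maybe_fail(rate=FAILURE_RATE, rng=random):
    # Raise the same exception as the always-fail variant above,
    # but only for a random subset of calls.
    if rng.random() < rate:
        raise RuntimeWarning("Simulating timeout")
```

With a rate somewhere between 0 and 1, repeated "refresh all metadata" runs should hit a different subset of movies each time, which is the pattern seen when meta.plex.tv is slow.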

I think the correct behavior would be to abort the entire update when some services are not reachable. The failure is merely due to timeouts, so it should be possible to fetch correct data later. This would avoid overwriting existing data from the correct source with more up-to-date data from the wrong source, which the user is presumably not interested in. (I want my ratings from the correct source all the time, otherwise they are not comparable.)
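
A minimal sketch of what "abort the entire update" could look like, written as standalone Python rather than as a patch to the agent. UpdateAborted, refresh and the two fetch callables are hypothetical names, and the dictionary stands in for Plex's stored metadata.

```python
class UpdateAborted(Exception):
    """Hypothetical signal: a required source could not be reached."""


def update(guid, fetch_tmdb, fetch_plex_movie):
    # fetch_tmdb and fetch_plex_movie are stand-ins for the agent's real lookups.
    metadata = fetch_tmdb(guid)
    try:
        movie = fetch_plex_movie(guid)
    except Exception as e:
        # Instead of logging and carrying on, give up on this refresh.
        raise UpdateAborted("no Plex movie data for %s: %s" % (guid, e))
    metadata.update(movie)
    return metadata


def refresh(library, guid, fetch_tmdb, fetch_plex_movie):
    # Only replace the stored entry when update() ran to completion.
    try:
        library[guid] = update(guid, fetch_tmdb, fetch_plex_movie)
        return True
    except UpdateAborted:
        return False  # stored metadata is left untouched
```

The trade-off in this sketch is that a movie with no stored metadata yet gets nothing until a later refresh succeeds.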

@ChuckPA, this is not specifically Linux-related but I’m not sure who else I should ask…

maxflix · February 17, 2018, 6:23pm

Thanks for the note @jonkv - so there’s a couple things to address here:

I understand your frustration with the plex metadata servers. In the last couple weeks we have made some major updates to the fleet and it seems like you ran into some of the issues that other users had, as well. We’ve since fixed them, so I invite you to clear your agent HTTP cache and try to Refresh All again. (https://support.plex.tv/articles/202967376-clearing-plugin-channel-agent-http-caches/)
When we were designing this agent, we decided that some metadata was better than no metadata, so when a single service fails we continue with the metadata that was available. I know that this can be frustrating for power users that want specific metadata or nothing at all, but we had to design for the common case where folks are less particular about the source of their metadata.

If you continue to have problems, let me know here, because I’d love to help!

jonkv · February 17, 2018, 8:00pm

@maxflix said:

If you continue to have problems, let me know here, because I’d love to help!

Actually I personally don’t have these problems anymore, because I’ve modified the agent in various ways to make it behave the way I want. I just happened to see other people referring to the same issue (https://forums.plex.tv/discussion/295051/plex-agent-not-downloading-imdb-rt-ratings-with-metadata) and thought I should send in a better report.

Technically there shouldn’t be any major problems involved in making the agents realize the difference between updating information (only from the preferred source because you already have an older version available) and getting information for a new movie where no information exists before (from whatever source is available), but I understand if this is not a high priority.
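
Roughly, the distinction could come down to a merge rule like the one below. It is only a sketch under my own assumptions: merge_refresh and its three arguments are hypothetical and do not correspond to anything in the actual agent or framework.

```python
def merge_refresh(existing, preferred, fallback):
    """Hypothetical merge step for one movie.

    existing:  metadata already stored for this movie, or None for a new movie
    preferred: fresh data from the preferred source, or None if that fetch failed
    fallback:  fresh data from whatever other source answered, or None
    """
    if preferred is not None:
        return preferred  # normal case: the preferred source answered
    if existing is not None:
        return existing   # update: keep the older data from the right source
    return fallback       # new movie: some metadata is better than none
```

For example, merge_refresh(stored, None, tmdb_data) keeps the stored entry, while merge_refresh(None, None, tmdb_data) falls back to the TMDb data.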

hammey666 · February 18, 2018, 8:18pm

Max, is there any way you could walk me through clearing those caches? I have the files, I just don’t know exactly what to clear. Thank you.

maxflix · February 19, 2018, 4:49pm

https://support.plex.tv/articles/202967376-clearing-plugin-channel-agent-http-caches/

@hammey666 - this article should help you clear your caches. I would recommend clearing at least the ones named com.plexapp.agents.imdb and com.plexapp.agents.themoviedb.

JDA88 · March 7, 2018, 9:25am

I don’t know if it’s related, but I have trouble with some well-known movies:

Error opening URL 'https://meta.plex.tv/m/0109830?lang=fr&ratings=1&reviews=1&extras=1'
Plex_Movie_FAILED - Error obtaining Plex movie data for 0109830: HTTP Error 404: Not Found
Error obtaining extras for 0109830: 'NoneType' object has no attribute 'xpath'
Error obtaining Rotten tomato data for 0109830: 'NoneType' object has no attribute 'xpath'

Considering the fact that 0109830 is … Forrest Gump and that
https://meta.plex.tv/m/0109830?lang=fr&ratings=1&reviews=1&extras=1
and
https://meta.plex.tv/m/0109830?lang=en&ratings=1&reviews=1&extras=1
both return “Could not find metadata for requested item”

They might have incomplete data on meta.plex.tv, no?

Mike_P · September 23, 2019, 10:38am

Hey Jonkv. Can you tell me how you modified the Agent to get imdb ratings?
I seem to have this bug described in the forum where I no longer get imdb ratings. I’ve tried changing every setting I can without any success. I’m at my wits end.

tvebax · October 28, 2019, 9:35pm

Plex imdb ratings are broken, have been for years.

https://forums.plex.tv/t/imdb-ratings-doesnt-update-even-when-i-refresh-meta-data/433642/15

This goes for RT ratings too. And using the TMDB movie agent instead of the Plex Movie agent does not fix it either.

No way to fix locally, so might as well save the time debugging.

system · January 26, 2020, 9:35pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
