PLEX movie issue - CPU death spiral

You should try “Get Info” on the movie and then “View XML”. You will likely see the same CPU spike without even having to play the movie.

I’ve been running the legacy TMDB agent on 1 server as a test and the “new” movie agent on the other. There’s nothing different between these Plex servers aside from the agent used on the movie libraries.

The legacy agent has yet to be affected (it’s been about 2 weeks now).

Last night I had a few restarts, but I was unable to determine which movie was causing the issue. I have a script that attempts to load the XML for each item and prints the failed items, but it seems that attempting to load the XML caused the item to be fixed. That server hasn’t crashed since.
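For anyone who wants to try the same kind of check, here is a minimal sketch of such a script, not the exact one described above. It assumes a local server on the default port, a valid token, and that `/library/sections/<id>/all` and `/library/metadata/<ratingKey>` behave as they do on my understanding of the server API. `BASE_URL`, `TOKEN`, the function names, and the single `includeRelated=1` flag are placeholders chosen for illustration; “View XML” sends more flags than that.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "http://127.0.0.1:32400"  # assumption: local PMS on the default port
TOKEN = "YOUR_PLEX_TOKEN"            # placeholder, replace with a real token


def http_get(path, timeout):
    """Fetch a path from the server and return the raw response body."""
    req = urllib.request.Request(BASE_URL + path, headers={"X-Plex-Token": TOKEN})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def rating_keys(section_xml):
    """Extract (ratingKey, title) pairs from a library listing."""
    root = ET.fromstring(section_xml)
    return [(v.get("ratingKey"), v.get("title")) for v in root.iter("Video")]


def find_slow_items(items, fetch, timeout=30.0):
    """Request each item's metadata; return items that error out or run long."""
    bad = []
    for key, title in items:
        start = time.monotonic()
        try:
            # includeRelated=1 is an assumption about which flag matters
            fetch(f"/library/metadata/{key}?includeRelated=1", timeout)
        except Exception as exc:  # timeouts, connection resets, HTTP errors
            bad.append((key, title, f"error: {exc}"))
            continue
        elapsed = time.monotonic() - start
        # the socket timeout is per read, so also check total wall time
        if elapsed > timeout:
            bad.append((key, title, f"slow: {elapsed:.1f}s"))
    return bad
```

Usage would be roughly `find_slow_items(rating_keys(http_get("/library/sections/1/all", 60)), http_get)`, where section `1` is a placeholder. Keep in mind that, as noted above, simply requesting the XML may “fix” the item, so record the failing keys the first time through.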

I’d like to be able to determine which movie is failing so I can take a look in the DB and see what the metadata looks like for that item.
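Once a failing item is known, one way to capture its state is to dump the row before and after a metadata refresh and compare the two. The sketch below assumes you are working on a copy of the database, that the item lives in the `metadata_items` table, and that the `id` column matches the item's ratingKey. The function names are made up for illustration, and a stock `sqlite3` build may not be able to read every table in a Plex database.

```python
import sqlite3
from pathlib import Path


def fetch_row(db_path, item_id):
    """Return one metadata_items row as a dict, or None if it is missing."""
    # Open read-only so the copy cannot be modified by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        conn.row_factory = sqlite3.Row
        row = conn.execute(
            "SELECT * FROM metadata_items WHERE id = ?", (item_id,)
        ).fetchone()
        return dict(row) if row is not None else None
    finally:
        conn.close()


def diff_rows(before, after):
    """Map each changed column to its (before, after) values."""
    return {
        col: (before[col], after.get(col))
        for col in before
        if before[col] != after.get(col)
    }
```

Calling `fetch_row` on a copy taken while the item is broken, then again on a copy taken after the refresh, and passing both dicts to `diff_rows` would list only the columns that changed.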

That’s basically what I did previously, but I only looked at the metadata_items table. I re-checked the table after refreshing the metadata, and the only differences were various timestamps being updated and a slightly different extra_data field. The “bad” entry had cloudAssetRefreshedAt=1677255425 and the refreshed entry was missing that param in the extra_data string. It didn’t seem to me like that would be the culprit… perhaps the answer lies in another table. It would be great to know what query is actually being run when the metadata is requested with all those flags used by “View XML”.
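To make that kind of extra_data comparison less error-prone than eyeballing two long strings, something like the following could help. It is only a sketch and assumes the field is a URL-style `key=value&key=value` string, which is how it looked in my rows; if your server stores it in another format, the parsing step would need to change. The function names and `exampleKey` are made up for illustration.

```python
from urllib.parse import parse_qsl


def parse_extra_data(extra_data):
    """Parse a key=value&key=value string into a dict (assumed format)."""
    return dict(parse_qsl(extra_data, keep_blank_values=True))


def diff_extra_data(before, after):
    """Return keys removed, added, and changed between two extra_data strings."""
    old, new = parse_extra_data(before), parse_extra_data(after)
    removed = {k: old[k] for k in old.keys() - new.keys()}
    added = {k: new[k] for k in new.keys() - old.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return removed, added, changed


# exampleKey is hypothetical; cloudAssetRefreshedAt is the value quoted above
bad = "exampleKey=1&cloudAssetRefreshedAt=1677255425"
refreshed = "exampleKey=1"
removed, added, changed = diff_extra_data(bad, refreshed)
print(removed)  # {'cloudAssetRefreshedAt': '1677255425'}
```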

I was able to test this morning since no one was using my server. No spike when viewing XML on that file. CPU is still running at around the same ~40% while playing it, though. None of the other movies with the same codec and comparable bitrate do the same. I even tried a movie that I know taxes the system (it has trouble playing smoothly on some playback devices): The Fifth Element. Both my 4K BluRay transfer and my 35mm Preservation copy produce normal, expected CPU load (up to 10-15% when initially loading, then back down to nearly 0% when playing). So I’m not totally sure if this movie’s behavior is related to the thread topic, but I can 100% confirm my issue with Star Wars was exactly the same as what you described, until the file was fixed by either refreshing metadata or running analysis.

Something else that may or may not be related: I just viewed XML for, then played, the v1.0 version of Return of the Jedi’s 4K83 fan edit. It played fine, with CPU never above 10% during load. But after stopping playback, it took nearly 2 minutes for the Plex app to become responsive again. Returning to the Movies library gave me blank posters with no text for ~90 seconds, and it took another 30 seconds or so for things to load in and behavior to return to normal. There were no CPU spikes during this event, but it still seems like it might be related, as this only happened after RotJ and I’ve been testing movies for the last 20 minutes without issue. Either way, very curious.

Across the board, the related movies section takes ages to load, also, and it completely fails to load about 30% of the time.

Edit: Viewing XML on Empire Strikes Back v2.1 of the D+80 fan edit (4K HEVC Main 10, 60.9Mbps) spiked my CPU to nearly 20% and held it there for the full 90 seconds it took to load. Just viewing the XML. It took nearly a minute to start playback. The CPU never spiked above ~10%, and it took ~a minute to drop down near 0%, which is about twice as long as normal. After stopping playback I went to look at the XML again. It still took the same amount of time to load, but never went above 10% this time (same load as starting the movie). So it seems viewing XML may have “fixed” whatever caused the significant CPU spike?

Assuming the slowness is a database lookup issue… it could now just be cached? I suppose you could try a server restart and then see if it’s still fast or not. I’m not sure I’ve ever waited a full 90 seconds for the metadata to load. I think most people would give up if playback took that long to start (and I think many clients time out before that).

I decided to give this same movie a try. I added it and it took 5 mins to view the XML. I have an older 24-core Xeon setup and it pegged one core at 100% for the full 5 mins until the metadata XML returned.

That being said, the movie is unplayable because playback calls the metadata, which takes 5 mins to load, and all the playback clients time out before that. I’m sure a Refresh Metadata kick will resolve it, but it still seems to be a mystery what the “bad” state actually is and what the lookup is doing that takes all these CPU cycles.

@anon18523487 it seems like this issue exists across various installations and databases. Are you guys able to replicate it? Do you need more information? How can we push forward to get an official bug and someone looking into it?

@ChuckPa Could you provide some insight to the problem we’re having here? (I apologize for the ping. I’ve just been fighting this issue for over a month and I’m getting desperate.)

For me the issue is most prominent on my 4K and 4K DV libraries (Maybe because they’re duplicates of the items in the normal movie library?)

The XML for each item will load forever and doesn’t seem to respect any timeout value it should’ve been given. After 30 seconds or so the server will go unresponsive until it crashes by itself or the container gets restarted when the healthcheck fails.

The issue is present on all PMS versions from 1.29 to the latest beta. The old, legacy agents don’t seem to have this issue so it makes me think there’s an issue with the metadata itself that the Plex Movie agent uses?

@jseeley It is likely due to an issue with the database. No, I can’t reproduce it. Can you DM me your database (zipped please)?

Even zipped it’s 483MB; maybe @Quick010 has a smaller example?

That’s fine. If you’re able to upload it somewhere, I have no issue downloading.

Edit - To be clear, I mean upload it to a sharing service like Google Drive and send me the link.

PM sent

Were we able to determine anything from jseeley’s db?

I’ve had to fully switch over to using the legacy movie agent because the new agent is completely unusable for me.

Also, it seems to be the same movies triggering this on completely unrelated servers. So I’m not convinced it’s a database issue. But obviously chase all possible solutions. Just making sure it’s clear that the same movies are triggering this on many servers.

Agreed. I’ve tried nearly every major PMS version going back nearly a year and they all have the same problem for me after about a week of a fresh library scan/import. Previously those versions were completely stable.

The only thing I can think of that could’ve changed is the Movie Agent or the metadata itself.

Any issue grabbing the zip of my database? Still having trouble replicating the issue?

@anon18523487 anything else we can do to move this forward? Let me know if you need other logs or you couldn’t grab my zip file.

There are several other people I’ve chatted with who also see the same behavior, regardless of PMS version. They’re all using the default Plex Movie Agent.

I’ve moved my own server over to the legacy agent which is hardly a fix because it’s slow and inaccurate at matching… But it doesn’t crash my server.

Some help troubleshooting or acknowledgement of the issue would be appreciated.

Exactly my issue. Down to IncludeRelated. I’ve rebuilt my DB with Chuck’s tool several times, and don’t see any obvious DB issues.
Do you have it on TV shows or just movies?

I don’t see it against the same library folder on a parallel Ubuntu VM. I’ll try copying the whole folder to a 2nd Docker container if I can swing it spacewise.
Crazy idea, could it have to do with the 64M /dev tmpfs mount? Could SQLite be choking on a large query? I explicitly mapped /tmp to /dev/shm and the problem persists.
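A quick way to at least rule the tmpfs theory in or out is to watch free space on those mounts while a slow metadata request is running. The sketch below only reports usage; it says nothing about what SQLite is actually doing. The function name and the path list are my own choices for illustration, so adjust the paths to whatever your container mounts.

```python
import shutil


def usage_report(paths):
    """Return total/used/free space in MiB per path, or None if unreadable."""
    report = {}
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            report[path] = None  # path missing or not accessible
            continue
        report[path] = {
            "total_mib": usage.total / 2**20,
            "used_mib": usage.used / 2**20,
            "free_mib": usage.free / 2**20,
        }
    return report


if __name__ == "__main__":
    # Paths are examples; run inside the container while the request hangs.
    for path, info in usage_report(["/dev", "/dev/shm", "/tmp"]).items():
        print(path, info)
```

If free space on any of these drops toward zero during the spike, that would support the idea; if it stays flat, the mount size is probably not the problem.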

I’m only seeing it with movies using the new Plex Movie agent. Series/Episodes don’t seem to have “related” content so they’re seemingly unaffected.

I’ve also used Chuck’s dbrepair and have gone as far as completely rebuilding the server from scratch with multiple different PMS versions going back to the early 1.29 builds that previously worked flawlessly, which makes me think the issue is with the metadata itself.