Win10 Plex server crashing every few days from memory growth

Intel NUC 7i7BNH running Windows 10 Pro 21H2 19044.1415:
Processor: Intel(R) Core™ i7-7567U CPU @ 3.50GHz
Installed RAM: 32.0 GB (31.9 GB usable)
System type: 64-bit operating system, x64-based processor

Plex server Version 1.25.3.5385

Clients are all sorts: Roku, Apple TV, Chromecast, Samsung, etc.

I’ve been running my Plex server on a NUC for a few years, but for the last few months I’ve been seeing Plex crashes due to memory usage.
I started to take more notice of it because when memory usage got high, some of the thumbnails for media items on the Plex Home page would start showing up blank, and no matter what I did they wouldn’t update. Then, after it eventually crashed and I started it again, things were fine and all the thumbnails showed up again.

I started looking in the logs and noticed messages like:
Jan 05, 2022 18:40:32.209 [7244] ERROR - Format [JPEG] - Insufficient memory (case 4)
Jan 05, 2022 18:40:32.209 [7244] ERROR - Error resizing an image, we don’t trust what we cached [C:\Users\tooon\AppData\Local\Plex Media Server\Cache\PhotoTranscoder\b3\b3b4d07f229a7470821e0a74d6789a75a4c40770.jpg]

So I’ve come to a point where it’ll stay up for a day or two, but it eventually dies because it’s running into that 2 GB memory limit. Some nights I check memory usage before bed, exit Plex, and start it again so that it doesn’t die unexpectedly overnight or the next day.

Sadly, since the crashes are memory related, the few crash directories it’s been able to create are all empty.

I’ve got a number of log zips from some of those crash occurrences, and from times when the memory balloons quickly and I stop/start it before bed.

I have Debug logging enabled, and also enabled LogMemoryUse and upped LogNumFiles to 50 to try and hold onto more logs for better troubleshooting, but I wish I could see more of what’s going on to correlate between logs. I’ve had vmmap running to be able to visualize the memory growth on an ongoing basis. I haven’t given procmon a shot yet though.
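In case it’s useful, checking what those hidden settings are currently set to is easy to script. This is just a minimal sketch with Python’s winreg; the HKCU\Software\Plex, Inc.\Plex Media Server location and the string value type are my assumption of how the preferences are stored:

```python
# Read Plex Media Server hidden preferences from the Windows registry.
# Assumes the usual per-user key and that values such as LogMemoryUse and
# LogNumFiles are stored as REG_SZ strings.
import winreg

KEY_PATH = r"Software\Plex, Inc.\Plex Media Server"

with winreg.OpenKey(winreg.HKEY_CURRENT_USER, KEY_PATH) as key:
    for name in ("LogMemoryUse", "LogNumFiles"):
        try:
            value, _type = winreg.QueryValueEx(key, name)
            print(f"{name} = {value!r}")
        except FileNotFoundError:
            print(f"{name} is not set")
```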

I also have some “Plex Transcoder.exe.####.dmp” files from …\AppData\Local\CrashDumps, not sure if those would be relevant or not.
Based on a recommendation from another thread, I looked in …\AppData\Local\Temp and found a bunch of files named logs.zip from the timeframe these crashes have been happening in; when I rename them to have a .zip ending, they turn out to be Plex log zips. I’m guessing these may be from instances where it crashed while trying to build the crash-report zip, but due to the memory condition it couldn’t complete it.

Interestingly it crashed again just as I was writing this up:
Jan 08, 2022 21:38:40.393 [7000] ERROR - Thread: Couldn’t add a new thread to the pool of size 11: boost::thread_resource_error: Resource temporarily unavailable
Jan 08, 2022 21:38:40.393 [16292] ERROR - Thread: Uncaught exception running async task which was spawned by thread 7452: bad allocation

Thanks for the logs and transcoder dumps

With regards to the memory issues and the suspected memory-related crashes - could you please shut down Plex Media Server, run this query on the database, and let me know what is returned:

select count(metadata_item_id),metadata_item_id from metadata_relations group by metadata_item_id order by count(metadata_item_id) desc limit 20

The database file to open with sqlite3 or gui tools such as SQLiteStudio is:

C:\Users\tooon\AppData\Local\Plex Media Server\Plug-in Support\Databases\com.plexapp.plugins.library.db
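If you would rather script it than use a GUI, a minimal sketch using Python’s built-in sqlite3 module would be roughly the following (shut down the server first; the URI opens the file read-only):

```python
# Run the metadata_relations query against the Plex library database.
# Opening read-only via a file: URI avoids touching the database.
import sqlite3
from pathlib import Path

DB = Path(r"C:\Users\tooon\AppData\Local\Plex Media Server"
          r"\Plug-in Support\Databases\com.plexapp.plugins.library.db")

QUERY = """
select count(metadata_item_id), metadata_item_id
from metadata_relations
group by metadata_item_id
order by count(metadata_item_id) desc
limit 20
"""

conn = sqlite3.connect(DB.as_uri() + "?mode=ro", uri=True)
for count, item_id in conn.execute(QUERY):
    print(count, item_id)
conn.close()
```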

For transcoder crashes we will need to make sure there are server logs covering the time of the crash, identify the media involved, and get a sample that can recreate the crash.

Thanks!
I get this from that query:

sqlite> select count(metadata_item_id),metadata_item_id from metadata_relations group by metadata_item_id order by count(metadata_item_id) desc limit 20;
63|85153
62|37757
59|76781
52|37328
52|38193
51|80348
49|79801
48|56227
48|86224
45|77733
43|54998
41|73730
40|78886
40|79584
39|56004
38|56005
37|77960
37|80998
37|81961
36|74390

As far as those transcoder crash dumps, I’ll see if I can correlate their times with any of the logs I’ve got.

I have the same issue.

The Plex Transcoder crashes on the 29th were all at the start of plays by the same user.
He watched 6 episodes of a TV show and then a movie. The first episode didn’t leave a dump, but the next 5 did, and the movie did.
But they all played for him without any issues.

These are the timestamps of those .dmps and the logs are in this log dump: Plex Media Server Logs_2021-12-30_12-38-53.zip

11:48:33 – This one didn’t drop a Transcoder dmp
12:08:40
12:34:14
14:20:11
14:47:36
15:23:22
17:44:38

I’ve got HW transcoding enabled, and since it’s on a NUC I’ve been expecting it to use Quick Sync, which the logs do mention:

INFO - [Transcode] [FFMPEG] - MFT name: 'Intel® Quick Sync Video H.264 Encoder MFT'

And they all say this

TPU: hardware transcoding: enabled, but no hardware decode accelerator found

But all of the ones that generated dumps (i.e., all but the first) end up bombing out with this same exit code:

DEBUG - Jobs: 'C:\Program Files (x86)\Plex\Plex Media Server\Plex Transcoder.exe' exit code for process 10964 is -1073741819 ()
DEBUG - Streaming Resource: Changing client to use software decoding

This is the profile that it’s using to set up his session

DEBUG - [Transcode] TranscodeUniversalRequest: using augmented profile Roku-7.x

I just tried to replicate the issue here at home on my Roku, forcing it to transcode down to 720p, but it successfully fires up hardware transcoding using dxva2 and qsv. Using one of the same episodes he was watching, I don’t see the same messages I saw from his crashes.

Overall, the transcoder crashes are less concerning to me since the media ends up playing fine for the user. Also, from what you’ve said, that exit code is the normal Windows Access Violation process crash (0xC0000005), which tends to be specific to the media file content.
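For what it’s worth, the conversion checks out: -1073741819 is just the signed 32-bit view of that NTSTATUS code. A quick check:

```python
# The transcoder exit code is a signed 32-bit value; mask to see the NTSTATUS.
print(hex(-1073741819 & 0xFFFFFFFF))  # 0xc0000005 -> STATUS_ACCESS_VIOLATION
```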

I’m a lot more worried about why the server keeps dying. Looks like I don’t have any runaway metadata associations from that query.

Thanks - so this memory issue is different and is not related to a massive number of extras in the database.

Will have a look at the server logs for clues

Thanks

@tooon could you send me a copy of the PMS database - download the database, upload it with the earlier diagnostics, and send me an updated link by PM please.

I will refer the issue to the development team. I can see examples of memory usage going up and then being released, but in the last set of logs the allocated memory remained allocated.

I would like to see what the database has for the items that were referenced around the time when the memory usage climbed from 164 MB to 509 MB between 14:23:18 and 14:23:22 on the 9th Jan - there was a transcode starting for Apple TV around that time - the first Apple TV transcode in the logs.

Two more crashes, one yesterday (18:20 on 1/12) and one today (16:58 on 1/13). They don’t necessarily look memory related, unless the memory usage balloons from 600 MB to the max in less than a second. I have no idea what’s going on at this point.
I’ve uploaded log bundles from both to the link that I sent you before.

Are these dmp files in %TEMP% zero length files?

2f694def-cb33-452e-8139-63e1fb9e27ab
532cca40-c265-4869-9445-f60a3c1f967d

If these crashes were not memory related then the dmp files should be valid and not zero length. Could you please look in %TEMP% for files with those names - they would probably have a .dmp filename extension.

The %TEMP% directory for the Plex Media Server process is here:
C:\Users\tooon\AppData\Local\Temp\
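If it’s easier to script the check, something roughly like this would do it - just a sketch, and it assumes the files sit directly in that Temp folder under those GUID names, with or without a .dmp extension:

```python
# Check whether the crash dump files in %TEMP% exist and are zero length.
import os
from pathlib import Path

temp_dir = Path(os.environ.get("TEMP", r"C:\Users\tooon\AppData\Local\Temp"))
guids = [
    "2f694def-cb33-452e-8139-63e1fb9e27ab",
    "532cca40-c265-4869-9445-f60a3c1f967d",
]

for guid in guids:
    matches = list(temp_dir.glob(guid + "*"))
    if not matches:
        print(f"{guid}: not found")
    for path in matches:
        print(f"{path.name}: {path.stat().st_size} bytes")
```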

Yes, those files exist, but don’t have .dmp extensions. I copied them into the “plex files from appdata-local-temp” subfolder in the google drive link, along with three other files that I think are dumps from last week without the .dmp extension.

Thank you

I have looked at the 3 good dump files and they were all for the Intel graphics driver crashing when Plex Media Server was checking out hardware acceleration.

The crashes are identical to what I looked into back in 2019 / 2020, which were happening with some Intel driver versions. The results of that investigation were summarized in the table in this forum post.

Suggest you check the version of the drivers you have and experiment with different versions - or bring them up to date if you are running old drivers.

The crashes that were due to the Intel graphics driver were these:

Jan 07, 2022 20:13 - 499d9ea3-8d35-4e67-b5c1-6ace2435cdbf
Jan 09, 2022 19:36 - 787e6212-f9d3-474d-bd63-026074c0e471
Jan 12, 2022 18:20 - 532cca40-c265-4869-9445-f60a3c1f967d

Looking at the logs for the other crashes

This crash shows the same sequence leading up to the crash time, so it is also Intel driver related:
Jan 13, 2022 16:58
(there was a memory allocation failure about 4 hours earlier, but it would not be relevant to the crash)

The crash at Jan 08, 2022 21:38:40 appears to be memory related.

It is not obvious what caused the crash at Jan 11, 2022 17:05:04.
At the same time, the EasyAudioEncoder.exe and Plex Tuner Service.exe processes exited with exception code 0x40010004 - perhaps the Windows System Event Log will have some clues as to what was happening at this time. Plex Media Server itself may have crashed out with the same Windows exception code.
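If it helps, a rough sketch of pulling the System log entries around that time with wevtutil is below - it assumes wevtutil is on PATH, and note that the TimeCreated filter uses UTC, so the window needs adjusting for your local timezone offset:

```python
# Query the Windows System Event Log for entries around the Jan 11 17:05 crash.
import subprocess

query = ("*[System[TimeCreated[@SystemTime>='2022-01-11T16:55:00.000Z' "
         "and @SystemTime<='2022-01-11T17:15:00.000Z']]]")

result = subprocess.run(
    ["wevtutil", "qe", "System", f"/q:{query}", "/f:text", "/c:100"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```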

I suggest you resolve the Intel driver issue, then see what we are left with and whether the remaining crashes are just down to memory usage.

Did you make any changes to settings etc. after this memory-related crash?
Jan 08, 2022 21:38

Thanks for the update.
Looks like at Jan 11, 2022 17:05:04 it was actually coming back up from a reboot. Windows had been patched and I had rebooted it. Strange that it would end up right then.

I’ll jump on the graphics driver thing. I looked through that thread in the past, when I saw some strange transcoding issues a year or two ago, and I had disabled HW transcoding at the time. While I’m at it I’ll take care of a couple of other driver updates as well.

I’m pretty sure I haven’t made any changes in the in-app settings or registry settings other than the two logging items (LogMemoryUse/LogNumFiles).

I was just wondering if you made any changes that stopped the crashes due to high memory usage from repeating - the last such crash was Jan 08, 2022 at 21:38.

I did update to the new beta build that showed up, I think it was yesterday or the day before.

A friend on Apple TV has been watching a few things today, two transcodes and now one direct stream. VMMap shows the committed memory climbing steadily since the server start, over the last few hours, to over 1.6 GB. The graph shows it’s leveled off there now that the direct stream is running.

This is one of the same users with AppleTV that you noted in the post with the graph above:
“when the memory usage climbed from 164 MB to 509 MB between 14:23:18 and 14:23:22 on the 9th Jan - there was a transcode starting for Apple TV around that time - the first Apple TV transcode in the logs”

I just uploaded an mmp and a screenshot from vmmap, along with a log bundle, all from just now, to the Google Drive link.

I threw together a script to scrape the Memory Usage entries from the log and put them into CSVs, and I was able to generate a chart of the memory jumps that corresponds with the similar chart in vmmap.
Each of the spots at 18:09 and 18:56/57 where the memory use jumps and stays there is when that AppleTV user starts an episode that transcodes. Most of the time that this user has been watching things, I’ve been watching a movie that’s direct streaming locally.
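For reference, the scraper is nothing fancy - roughly along these lines. The exact wording of the LogMemoryUse lines below is an assumption on my part, so the regex would need adjusting to whatever the log actually prints:

```python
# Scrape memory-usage figures out of Plex Media Server.log into a CSV.
import csv
import re
import sys

# e.g. "Jan 16, 2022 14:39:11.900 [11660] DEBUG - ... Memory ... 509 MB ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2}, \d{4} \d{2}:\d{2}:\d{2}\.\d{3})"
    r".*Memory.*?(?P<mb>\d+(?:\.\d+)?)\s*MB",
    re.IGNORECASE,
)

def scrape(log_path, csv_path):
    with open(log_path, encoding="utf-8", errors="replace") as log, \
         open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["timestamp", "memory_mb"])
        for line in log:
            m = LINE_RE.search(line)
            if m:
                writer.writerow([m.group("ts"), m.group("mb")])

if __name__ == "__main__":
    scrape(sys.argv[1], sys.argv[2])
```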

Mem usage is up ~120 MB over the last ~13 hrs. I’ve had a handful of transcodes for Roku, Samsung, and an iOS direct stream, but it seems mostly stable for the moment. I’ve uploaded a new vmmap screenshot and mmp, along with another log bundle. I’ll keep watching things.

Memory usage has continued to climb slightly, and I’ve started to notice movie poster thumbnails showing up blank while browsing the library in Plex. I’m also seeing these errors about it in the log:

Jan 16, 2022 14:39:11.900 [11660] ERROR - Format [JPEG] - DIB allocation failed, maybe caused by an invalid image size or by a lack of memory
Jan 16, 2022 14:39:11.900 [11660] ERROR - Error resizing an image, we don't trust what we cached

I’m guessing it’ll crash in the near future, although most actual watching activity seems to have proceeded without generating any issues. I haven’t had any AppleTV transcodes though, which is when I expect things to balloon like they have been. I’ve seen transcodes for Rokus and Samsungs, and a few AppleTV Direct Streams, but no AppleTV transcodes since that last big jump on the 14th. I’ve uploaded a new mmp and a log bundle to compare.

Memory is still high and growing slightly each day, but no crashes yet. Also no AppleTV/tvOS transcodes in days. There have been plenty of other transcodes and AppleTV Direct Streams but it seems to be fine with those. I’ve uploaded new mmps and log zips each of the last few days.
I’m considering reaching out to one of my AppleTV users to try a vid that’ll need to be transcoded to see if it’ll tip over.

It finally crashed today at 17:07, but it had been almost unusable for a while before that.

I took a look just after 13:00 today and saw in vmmap that one of the colors, bold orange, had dropped off the visual graph completely. That color corresponds to the Heap. I was able to bring up the interface and browse around movies, but when I tried to bring up the settings to grab a log bundle, it told me it couldn’t load the settings. I saved another mmp from vmmap at that time, and I’ve got a log bundle from just now after restarting it. Both of those, along with mmps and log bundles from the last few days, are all up on the Google Drive link.

I was hoping one of my AppleTV users would watch something transcoded to make the memory use jump and crash it, but I guess it just took some time to die after those initial transcodes spiked it shortly after the last crashes last week.