Plex Deadlocking with too many hung connections issues for last 5 months now! (This ended up giving me anxiety attacks!) (Edited title for clarification)

Yep. I think that’s the last of the 1.22 releases? Anything in 1.23 or newer has had the deadlock issue for me.

What OS? Are you using Live TV?

Ubuntu 16.04.7, running Linuxserver.io’s Docker image on Docker 19.03.12.

Live TV is enabled but it’s not a thing I use all that often and wasn’t being actively used when sessions started to hang.

Why is this still an issue? It’s so frustrating. Why keep adding new features if they don’t fix the core of the product…

Strange, this went away for a good few months and now it seems to be happening again hmmm…

Yeah same. It started happening again a few days ago after I did an update and now it’s worse than ever before. The weird part is I tried going back down a few versions and it’s still happening.

My deadlocking came back from a Roku update that changed Live TV functionality but it was fixed very quickly and since then it hasn’t happened again.

If the deadlocking came back and going back to a working version of PMS does not solve the problem (as it was in my recent case) then my guess is the client app is causing the problem.

I only have Roku devices running at the moment so it was easy to isolate everything.

I have read the recent updates to this thread and there are indications of a deadlock issue coming back. I have nothing to investigate at this stage from these recent reports.

So if anyone is getting deadlocks on the current beta 1.25.0.5246-cb2507e4d or public release 1.24.5.5173-8dcc73a59 please let me have the deadlock diagnostics to start investigating again

There are instructions here for deadlock diagnostics
For Windows: Plex falls offline, doesn't crash PMS.exe - #18 by sa2000
For Linux: PMS crashes since installing first public 1.15.x - #11 by sa2000

For Linux - in case the dump does not get uploaded to the Plex back end system, there should be a copy of it saved in $TMPDIR

I just got some time to reproduce the issue on 1.25.0.5246. SEGV’d the process and the dump should be uploaded.

If it didn’t make it let me know and I can attach the files here.

Cannot find any uploaded crash reports for your plex.tv account.

Also for deadlocks - it is not just crash reports that are needed to investigate. There are 3 separate bits of diagnostics - all needed:

  • the list of connections (response to the /connections endpoint request)
  • the process dump from killing the process with -SEGV
  • the debug server logs
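As a rough sketch of how the first two items could be captured from a script: the helper names below are made up for illustration, and the sketch assumes the server is reachable on localhost at the default port 32400 and that the request needs an `X-Plex-Token`. The response is saved as raw bytes because its format is not described in this thread.

```python
import os
import signal
import urllib.parse
import urllib.request


def connections_url(token, host="127.0.0.1", port=32400):
    """Build the URL for the /connections request.

    Host, port and the token parameter are assumptions for this sketch.
    """
    query = urllib.parse.urlencode({"X-Plex-Token": token})
    return f"http://{host}:{port}/connections?{query}"


def save_connections(token, out_path, timeout=30):
    """Fetch /connections and save the raw response so it can be attached."""
    with urllib.request.urlopen(connections_url(token), timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)


def force_dump(pid):
    """Equivalent of `kill -SEGV <pid>` on Linux: crash the process so it dumps."""
    os.kill(pid, signal.SIGSEGV)
```

The order matters: call `save_connections(...)` first, then `force_dump(pid)`, since the process is gone once the signal is delivered. The debug server logs still have to be collected separately.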

Have you reverted your version of Plex Media Server since this? plex.tv shows your server as running on 1.22.3.4523 and not 1.25.0.5246

The server logs (Plex Crash Uploader logs) would identify the mini dump file generated and attempted to be uploaded - the file gets copied to $TMPDIR for the docker PMS process after the attempt to upload

After looking at the crash uploader log, it seems that there were too many dumps trying to be uploaded and it was failing. Does the uploader process write to that log if it’s successful? I created another SEGV dump about a half hour ago and restarted the Plex service. The server log shows the uploader process running and exiting with code 0 but nothing new, success or failure, was written to the uploader log.

Plex Logs.zip (3.0 MB)

Logs and a copy of /connections are attached.

And yes, I’d reverted to 1.22.3.4523 after seeing 1.25.0.5246 start to have processes run away on the server until it’d consumed 32 threads at 100%. I’m currently running 1.25 this morning but will probably have to revert before this evening to have it be usable.

Thanks for looking into it.

Thank you - this appears to be a transcoding-related issue. I do need the dump file and it did not make it to our back end system - dumps do not always get through even when logs suggest they did

For this kill -SEGV - it was at this time: Nov 24, 2021 10:57:59
I am surprised we did not create a Plex Crash Uploader.log when you restarted the system and the upload job ran

Nov 24, 2021 10:58:04.416 [0x7f836ce22b38] DEBUG - [JobRunner] Job running: '/usr/lib/plexmediaserver/CrashUploader' '--directory=/config/Library/Application Support/Plex Media Server/Crash Reports/1.25.0.5246-cb2507e4d' '--version=1.25.0.5246-cb2507e4d' '--platform=Linux' '--platformVersion=4.4.0-187-generic' '--serverUuid=2a482f0efbeb6c92a37f35029a780e0fb78a02bd' '--userId=xxxxxxx' '--sentryUrl=https://sentry.io/api/1233455/minidump' '--sentryKey=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' '--vendor=Docker' '--model=x86_64' '--device=Docker Container (LinuxServer.io)' '--allowRetries=0'
Nov 24, 2021 10:58:04.417 [0x7f836ce22b38] DEBUG - [JobRunner] Jobs: Starting child process with pid 1243
Nov 24, 2021 10:58:04.437 [0x7f836e09cb38] DEBUG - Jobs: '/usr/lib/plexmediaserver/CrashUploader' exit code for process 1243 is 0 (success)
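To find these CrashUploader runs in a long server log without scrolling, a small parser along these lines may help. The patterns are modelled only on the three lines quoted above, so other versions may format them differently, and the function name is hypothetical.

```python
import re

# Patterns modelled on the log lines quoted in this post; treat as assumptions.
TIMESTAMP = r"(?P<ts>\w{3} \d{1,2}, \d{4} \d{2}:\d{2}:\d{2}\.\d{3})"
JOB_RE = re.compile(
    TIMESTAMP + r".*Job running: '(?P<exe>[^']*CrashUploader)'(?P<args>.*)"
)
DIR_RE = re.compile(r"'--directory=(?P<dir>[^']*)'")
EXIT_RE = re.compile(
    TIMESTAMP
    + r".*'[^']*CrashUploader' exit code for process (?P<pid>\d+) is (?P<code>-?\d+)"
)


def crash_uploader_events(lines):
    """Return a list of dicts describing CrashUploader starts and exits."""
    events = []
    for line in lines:
        match = JOB_RE.search(line)
        if match:
            directory = DIR_RE.search(match.group("args"))
            events.append({
                "type": "start",
                "time": match.group("ts"),
                "directory": directory.group("dir") if directory else None,
            })
            continue
        match = EXIT_RE.search(line)
        if match:
            events.append({
                "type": "exit",
                "time": match.group("ts"),
                "pid": int(match.group("pid")),
                "code": int(match.group("code")),
            })
    return events
```

Feeding it the lines of `Plex Media Server.log` gives the time of each upload attempt and the Crash Reports directory it was pointed at, which can then be compared against the timestamps in `Plex Crash Uploader.log`.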

The last Plex Crash Uploader.log file was from 02:38 am ! That file would have identified the dump filename

Could you look into where the file created at about Nov 24, 2021 10:57:59 ended up. We do copy it to $TMPDIR if it gets picked up by the crash uploader

If you do not find the file - we could try disabling crash reporting and then on next occurrence get connections list and logs and look for dump file in the Crash Reports folder for the release - eg
/config/Library/Application Support/Plex Media Server/Crash Reports/1.25.0.5246-cb2507e4d for 1.25.0.5246

Sorry for the week delay – I was away from home and finally got some time to work on troubleshooting this again.

I disabled crash reporting and -SEGV’d the process and it creates the folder with the version number but, as far as I can tell, it never creates the .dmp file. As a test I tried it on the 1.22 version I’ve been running and it does create a file, but it’s 0 KB.

Is there something I’m missing for generating these crash reports?

I can anecdotally confirm you seem to be on the right track and the issue is related to transcoding - I have disabled the transcoder entirely for the last two weeks and my server has been way more stable. I came to this hypothesis watching my Tautulli playback logs and noticing an overlap of server crashes and users transcoding.

FWIW, I usually run an nVidia Quadro RTX 4000 (on Windows 10) for transcoding - my troubleshooting process was that I used CPU-only at first to try and isolate the issue. That seemed to help, but not entirely eliminate the problem, so I shut off transcoding entirely and (knock on wood) everything seems to have stabilized.

I’m using a Quadro P400 but I can disable the hardware encoding without an issue – this is a dual Xeon machine so it’s not like I was really in need of the horsepower; just looking for a bit more efficiency.

What’s really strange about it is yeah, there are connections showing transcodes but there weren’t actually any sessions with playback when threads start to get stuck. It’s often Roku clients because most people watching off my server are on Roku but there are a couple Android TV clients (my own included) that sometimes show up as stuck sessions when I’m not watching.

The only cases I have seen of 0 KB dmp files are when we run out of memory - eg we have an issue where the database may contain hundreds of thousands of extras for a specific movie leading to memory allocation failures. There was no evidence of memory usage going excessive - you could enable memory usage logging - see advanced preference LogMemoryUse in Advanced, Hidden Server Settings | Plex Support

Will need to get minidump creation to work

The dmp that I was after from the earlier deadlock - could you try searching the whole filesystem for it?
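One way to do that search, as a minimal sketch: walk a directory tree, list every `.dmp` file newest first, and flag the empty ones. The function names are made up for illustration, and the `.dmp` suffix is taken from the posts above.

```python
import os
import time


def find_dump_files(root, suffix=".dmp"):
    """Walk `root` and return (path, size_bytes, mtime) for each dump file."""
    found = []
    # onerror swallows unreadable directories instead of aborting the walk.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            if not name.lower().endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            found.append((path, info.st_size, info.st_mtime))
    # Newest first, so a dump from the latest kill -SEGV is at the top.
    found.sort(key=lambda item: item[2], reverse=True)
    return found


def describe(found):
    """Format results as text lines, flagging 0-byte dump files."""
    lines = []
    for path, size, mtime in found:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        flag = "  <-- empty" if size == 0 else ""
        lines.append(f"{stamp}  {size:>10}  {path}{flag}")
    return lines
```

For example, `print("\n".join(describe(find_dump_files("/config"))))` run inside the container would cover the Crash Reports folder; pointing it at `$TMPDIR` or at `/` widens the search, though a full walk from `/` can be slow.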

This seems to almost certainly have been the case for me. Disabling hardware transcoding solved the issue immediately on 1.25.

Looking back over the documentation it states that the minimum Nvidia driver version for Plex versions after 1.20 was 450, though I’d been running 430 (because of an old OS) for quite awhile and it didn’t seem to be a problem until jumping up to the 1.23 releases.

I finally got time to do a distro upgrade and got my server on the latest Ubuntu LTS and Nvidia drivers. Somewhere between a new OS and updated drivers the issue of ending up at 100% CPU usage seems to have resolved itself.

I did see this morning, though, that there were a couple of those same long-stuck (thousands of seconds) transcoder sessions as I was getting previously, though they weren’t consuming 100% CPU each. After the OS updates crash dumps look to be writing correctly as well.

It’s probably not as critical a thing for me so far since those sessions don’t seem to be killing the server, but I’ve attached all the logs to see if there’s something there that’s useful. If there’s anything else that’d be helpful for debugging please let me know.

2021-12-09 - Plex Logs.zip (5.0 MB)

Good that you can now get crash dump files !

I could not see any hung requests in the logs. There are a few paused transcode playbacks - paused for over 10 hours - but no indication of any deadlocks

If you get lockouts again - please get me fresh diagnostics - connections list, debug server logs and a forced dump file

Well, this had been a non-issue for months and suddenly in the last week it has started up again. When I get home I will pull the diagnostic bits, but when I checked it earlier today, before I killed the process, I had 200 connections ffs.

Why has this become a problem yet again?

@sa2000 it’s been almost a year for some people looking at this thread. Why has the team not fully investigated this?

Causes for deadlocks do get addressed and fixed - but we probably have new ones popping up. Unfortunately they are not easy to investigate and for each case we need to capture the connections list, process dump and debug server logs

There is also the possibility of some being transient if there is a heavy load, eg some sonic analysis running at the time

Would need to separate transient lockout issues from permanent deadlock problems and so would like to see those where the connection count is very high

May need to increase the number of log files (see Advanced, Hidden Server Settings | Plex Support) to get a longer period covered by the logs
