PMS 1.32.8.7639 on Linux crashing

Server Version#: 1.32.8.7639
Player Version#: Multiple - Roku, Android, Web UI
My Plex Media Server has begun crashing sporadically. When users try to play content, they get a short way in before the stream is killed. My server instance in their browser then shows as offline.
The logs and crash dumps are huge and I don’t know what I’m looking for here. journalctl is showing some Tautulli Python errors, like below:

Jan 07 21:15:58 plex Plex Media Server[3618388]: ****** PLEX MEDIA SERVER CRASHED, CRASH REPORT WRITTEN: /var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Crash Reports/1.32.8.7639-fb6452ebf/PLEX MEDIA SERVER/53be1e0e-f2f0-4a81-22162084-ebd3e232.dmp
Jan 07 21:15:58 plex kernel: show_signal_msg: 3 callbacks suppressed
Jan 07 21:15:58 plex kernel: PMS ReqHandler[3929630]: segfault at 7f7acc618d4c ip 00007f7ae0577dc1 sp 00007f7acc610ec0 error 4 in libnvcuvid.so.535.129.03[7f7ae0561000+af7000]
Jan 07 21:15:58 plex kernel: Code: 00 00 00 00 00 0f 1f 40 00 41 54 55 49 89 f2 53 48 89 d5 31 c0 48 89 cb b9 21 00 00 00 41 b9 08 01 00 00 48 81 ec 10 01 00 00 <8b> 96 0c 6c 00 00 8b b6 f8 6b 00 00 48 89 e7 49 89 e4 f3 48 ab 4d
Jan 07 21:15:58 plex systemd[1]: plexmediaserver.service: Main process exited, code=killed, status=11/SEGV
░░ An ExecStart= process belonging to unit plexmediaserver.service has exited.
Jan 07 21:17:01 plex CRON[3930186]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 07 21:17:01 plex CRON[3930187]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jan 07 21:17:01 plex CRON[3930186]: pam_unix(cron:session): session closed for user root
Jan 07 21:17:28 plex systemd[1]: plexmediaserver.service: State 'stop-sigterm' timed out. Killing.
Jan 07 21:17:28 plex systemd[1]: plexmediaserver.service: Killing process 3618427 (Plex Script Hos) with signal SIGKILL.
Jan 07 21:17:28 plex systemd[1]: plexmediaserver.service: Killing process 3618517 (Plex Script Hos) with signal SIGKILL.
Jan 07 21:17:28 plex systemd[1]: plexmediaserver.service: Killing process 3619588 (EasyAudioEncode) with signal SIGKILL.
Jan 07 21:17:28 plex systemd[1]: plexmediaserver.service: Killing process 3618537 (Plex Script Hos) with signal SIGKILL.
Jan 07 21:17:28 plex systemd[1]: plexmediaserver.service: Failed with result 'signal'.
░░ The unit plexmediaserver.service has entered the 'failed' state with result 'signal'.
Jan 07 21:17:28 plex systemd[1]: plexmediaserver.service: Consumed 1h 2min 25.652s CPU time.
░░ The unit plexmediaserver.service completed and consumed the indicated resources.
Jan 07 21:17:34 plex systemd[1]: plexmediaserver.service: Scheduled restart job, restart counter is at 1.
░░ Automatic restarting of the unit plexmediaserver.service has been scheduled, as the result for
Jan 07 21:17:34 plex systemd[1]: Stopped Plex Media Server.
░░ Subject: A stop job for unit plexmediaserver.service has finished
░░ A stop job for unit plexmediaserver.service has finished.
Jan 07 21:17:34 plex systemd[1]: plexmediaserver.service: Consumed 1h 2min 25.652s CPU time.
░░ The unit plexmediaserver.service completed and consumed the indicated resources.
Jan 07 21:17:34 plex systemd[1]: Starting Plex Media Server...
░░ Subject: A start job for unit plexmediaserver.service has begun execution
░░ A start job for unit plexmediaserver.service has begun execution.
Jan 07 21:17:34 plex systemd[1]: Started Plex Media Server.
░░ Subject: A start job for unit plexmediaserver.service has finished successfully
░░ A start job for unit plexmediaserver.service has finished successfully.
Jan 07 21:17:50 plex Plex Media Server[3930877]: Dolby, Dolby Digital, Dolby Digital Plus, Dolby TrueHD and the double D symbol are trademarks of Dolby Laboratories.

Jan 07 21:18:44 plex tautulli.tautulli[1196]: Job "force_stop_stream (trigger: date[2024-01-08 05:18:44 UTC], next run at: 2024-01-08 05:18:44 UTC)" raised an exception
Jan 07 21:18:44 plex tautulli.tautulli[1196]: Traceback (most recent call last):
Jan 07 21:18:44 plex tautulli.tautulli[1196]:   File "/snap/tautulli/1864/lib/apscheduler/executors/base.py", line 125, in run_job
Jan 07 21:18:44 plex tautulli.tautulli[1196]:     retval = job.func(*job.args, **job.kwargs)
Jan 07 21:18:44 plex tautulli.tautulli[1196]:   File "/snap/tautulli/1864/plexpy/activity_handler.py", line 642, in force_stop_stream
Jan 07 21:18:44 plex tautulli.tautulli[1196]:     row_id = ap.write_session_history(session=session)
Jan 07 21:18:44 plex tautulli.tautulli[1196]:   File "/snap/tautulli/1864/plexpy/activity_processor.py", line 185, in write_session_history
Jan 07 21:18:44 plex tautulli.tautulli[1196]:     section_id = session['section_id'] if not is_import else import_metadata['section_id']
Jan 07 21:18:44 plex tautulli.tautulli[1196]: TypeError: 'NoneType' object is not subscriptable
Jan 07 21:20:55 plex tautulli.tautulli[1196]: Job "force_stop_stream (trigger: date[2024-01-08 05:20:55 UTC], next run at: 2024-01-08 05:20:55 UTC)" raised an exception
Jan 07 21:20:55 plex tautulli.tautulli[1196]: Traceback (most recent call last):
Jan 07 21:20:55 plex tautulli.tautulli[1196]:   File "/snap/tautulli/1864/lib/apscheduler/executors/base.py", line 125, in run_job
Jan 07 21:20:55 plex tautulli.tautulli[1196]:     retval = job.func(*job.args, **job.kwargs)
Jan 07 21:20:55 plex tautulli.tautulli[1196]:   File "/snap/tautulli/1864/plexpy/activity_handler.py", line 642, in force_stop_stream
Jan 07 21:20:55 plex tautulli.tautulli[1196]:     row_id = ap.write_session_history(session=session)
Jan 07 21:20:55 plex tautulli.tautulli[1196]:   File "/snap/tautulli/1864/plexpy/activity_processor.py", line 185, in write_session_history
Jan 07 21:20:55 plex tautulli.tautulli[1196]:     section_id = session['section_id'] if not is_import else import_metadata['section_id']
Jan 07 21:20:55 plex tautulli.tautulli[1196]: TypeError: 'NoneType' object is not subscriptable

Tautulli is not new; I’ve been running it for 8-ish years without issue. I did recently add a GPU to my hypervisor host, which I pass through to the Plex VM. I don’t know what to look for to tell whether the GPU is the suspect here.

Architecture:
Hypervisor host running on ESXi with Intel Xeon (Old old gen) and Nvidia GTX 970 GPU.
A Linux VM running Ubuntu Server 22.04.3 LTS (Jammy) which runs the Plex binaries as a systemd service.
Nvidia driver version is 535.129.03 (CUDA 12.2)
Plex Media Server Logs_2024-01-07_21-43-35.zip (2.7 MB)

It’s crashing in the Nvidia driver.
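
For anyone wondering how to spot that in a huge journal: the kernel `segfault` line names the faulting thread and, after `in`, the library that contained the instruction pointer. Below is a minimal Python sketch that pulls those fields out of a line like the one posted above. The regex and the function name `parse_segfault` are my own for illustration, not part of any Plex or Nvidia tooling, and the pattern assumes the x86 message format shown in this thread.

```python
import re

# Matches the kernel's segfault message as it appears in the journal output above.
# The trailing "in <module>[base+size]" part is optional because the kernel
# omits it when the faulting address is not inside a mapped file.
SEGFAULT_RE = re.compile(
    r"kernel: (?P<thread>.+?)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>\d+)"
    r"(?: in (?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line):
    """Return a dict of fields from a kernel segfault line, or None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    if info["module"] is not None:
        # Offset of the faulting instruction inside the module's mapping.
        info["offset"] = hex(int(info["ip"], 16) - int(info["base"], 16))
    return info


# The line from the journal in the first post.
LINE = (
    "Jan 07 21:15:58 plex kernel: PMS ReqHandler[3929630]: segfault at "
    "7f7acc618d4c ip 00007f7ae0577dc1 sp 00007f7acc610ec0 error 4 in "
    "libnvcuvid.so.535.129.03[7f7ae0561000+af7000]"
)

info = parse_segfault(LINE)
print(info["thread"], info["module"], info["offset"])
# PMS ReqHandler libnvcuvid.so.535.129.03 0x16dc1
```

Feeding it each line of saved `journalctl` output and keeping the non-`None` results gives a quick list of which library each crash landed in. Here every hit would be expected to point at `libnvcuvid.so`, the Nvidia video decode library, rather than at a Plex binary.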

I don’t experience this on my P2200 so it might be a GPU card vs driver issue.

Would it be possible to get a sample, something just long enough to replicate what you’re seeing? Would a few hundred MB be enough? (At 13 Mbps, 200-250 MB should let me see a few seconds without a crash.) Please gauge appropriately.

As for making the sample,

  1. Guesstimate the size.
  2. Cut a sample using dd: dd if=inputfile of=samplefile bs=1M count=XXX, where XXX is the number of MB for the sample. Use a different name for of= than for if=, otherwise dd will overwrite the original.
  3. Upload that sample somewhere and post the link.
    (I’ll open a PM to you if you prefer to share privately)
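
If dd isn’t handy, the same steps can be sketched in Python. This is only an illustration under my own assumptions: `sample_size_mib` and `cut_sample` are hypothetical helper names, the bitrate is treated as constant (real files vary), and the copy works in 1 MiB blocks to mirror `bs=1M`.

```python
import math
import os

MIB = 1024 * 1024


def sample_size_mib(bitrate_mbps, seconds):
    """Approximate MiB needed to hold `seconds` of video at `bitrate_mbps` Mbit/s."""
    total_bytes = bitrate_mbps * 1_000_000 / 8 * seconds
    return math.ceil(total_bytes / MIB)


def cut_sample(src, dst, mib):
    """Copy the first `mib` MiB of `src` into `dst`; return the bytes written.

    Roughly equivalent to: dd if=src of=dst bs=1M count=mib
    """
    if os.path.abspath(src) == os.path.abspath(dst):
        # Opening dst for writing would truncate the original file.
        raise ValueError("source and destination must be different files")
    remaining = mib * MIB
    written = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while remaining > 0:
            chunk = fin.read(min(MIB, remaining))
            if not chunk:  # source is shorter than the requested sample
                break
            fout.write(chunk)
            written += len(chunk)
            remaining -= len(chunk)
    return written


# Two minutes at 13 Mbps comes out to a little under 200 MB.
print(sample_size_mib(13, 120))  # 186
```

A call such as `cut_sample("episode.mkv", "sample.mkv", 200)` (file names made up here) would then write the first 200 MiB. Cutting from the start of the file keeps the container header, so the truncated file should usually still be playable.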

Taking another possible path –

  1. In ESXi, we need (at least in ESXi 7.x) to lock the guest’s memory into RAM when using a Nvidia GPU
  2. Have you confirmed that memory is reserved / locked in RAM?

I used to have ESXi on a machine here but don’t now, so I can’t show you what I’m referring to. Hopefully I’m making sense?

Hey there, thanks for the response.

ESXi: Yes, I have memory reservation (locking) enabled on this VM.

Sample:
Are you looking for a video sample that’s been transcoded by the GPU or are you looking for a ‘virgin’ video that hasn’t yet been touched by the GPU?
I don’t know if I’ll be able to reproduce the issue on demand though, it’s just kinda happening here and there.

Yes, I was looking for a sample of the Madam Secretary file which your logs show.

Given this isn’t 100% reliable, something else must be afoot.

  • Are you using SR-IOV to share the GPU with other apps/VMs?
  • How much memory does that card have?
  • When it fails, any idea how many simultaneous transcodes are occurring?
    (Each transcode uses 200 to 600 MB of GPU memory.)

Sorry, a pre-transcode sample of the file or post-transcode? I’m thinking pre- since you suggested using dd.

SR-IOV isn’t supported by this card (GTX 970), so I’m using plain vanilla PCI passthrough.
This VM is the only VM consuming the card and, AFAIK, Plex is the only application using it. That matches what I saw in nvidia-smi and nvtop after the install: only ffmpeg processes were on it.
This card is a 4GB model. I believe only one transcode was running when it occurred, but I can’t say with any certainty.
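
As a rough sanity check on the memory question, here is a back-of-the-envelope Python sketch using the 200 to 600 MB per transcode figure quoted above. The function name `transcode_capacity` and the `reserved_mb` parameter are made up for illustration, and it treats 4GB as 4096 MB with nothing else using the card.

```python
def transcode_capacity(vram_mb, per_stream_low_mb=200, per_stream_high_mb=600, reserved_mb=0):
    """Return (pessimistic, optimistic) counts of transcodes that fit in GPU memory.

    The pessimistic count assumes every stream needs the high figure,
    the optimistic count assumes every stream needs the low figure.
    """
    usable = max(vram_mb - reserved_mb, 0)
    return usable // per_stream_high_mb, usable // per_stream_low_mb


worst, best = transcode_capacity(4096)
print(worst, best)  # 6 20
```

By that estimate even the pessimistic case leaves room for several streams, so a single transcode running out of GPU memory seems unlikely to be the whole story, though this is only arithmetic and not a measurement from the card.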

In the meantime I’ve removed the passthrough card from the Plex VM to give stability to my users while I muck around on this side.

Of note, I’m very seriously considering just getting a different GPU in the P-series. This thing is super old and is a consumer model, so it’s not exactly optimal for anything really.
