Problems with hardware transcoding -- stopped working suddenly

Server Version#: 1.40.2.8395
Player Version#: Web 4.125.1

Hello-

Hardware transcoding was working on my server and has now stopped. My Linux machine has the latest Nvidia drivers installed, and nvidia-smi shows my GPU. When I start a transcode, it does not use the hardware and fails within a few seconds.

Any ideas? Logs attached.
Plex Media Server Logs_2024-04-25_19-21-01.zip (2.1 MB)

According to the logs, the GPU is working.

  1. It’s being detected:
Apr 25, 2024 19:16:32.529 [131229884099384] DEBUG - [Req#467/Transcode] [FFMPEG] - Loaded sym: NvEncodeAPIGetMaxSupportedVersion
Apr 25, 2024 19:16:32.868 [131229884099384] DEBUG - [Req#467/Transcode] MDE: Cannot direct stream video stream due to profile or setting limitations
Apr 25, 2024 19:16:32.868 [131229884099384] DEBUG - [Req#467/Transcode] Codecs: testing h264 (decoder) with hwdevice nvdec
Apr 25, 2024 19:16:32.869 [131229884099384] DEBUG - [Req#467/Transcode] Codecs: hardware transcoding: testing API nvdec for device 'pci:0000:42:00.0' (GP104GL [Quadro P5000])
Apr 25, 2024 19:16:32.869 [131229884099384] DEBUG - [Req#467/Transcode] [FFMPEG] - Loaded lib: libcuda.so.1
  2. The transcoder starts on the P5000 but is then stopped by the player.
Apr 25, 2024 19:16:36.054 [131229884099384] DEBUG - [Req#477/Transcode] TPU: hardware transcoding: using hardware decode accelerator nvdec
Apr 25, 2024 19:16:36.054 [131229884099384] DEBUG - [Req#477/Transcode] TPU: hardware transcoding: zero-copy support present
Apr 25, 2024 19:16:36.054 [131229884099384] DEBUG - [Req#477/Transcode] TPU: hardware transcoding: using zero-copy transcoding
Apr 25, 2024 19:16:36.054 [131229884099384] DEBUG - [Req#477/Transcode] [Universal] Using local file path instead of URL: /mnt/nas/tv_shows/Last Week Tonight with John Oliver (2014)/Season 11/Last Week Tonight with John Oliver (2014) - S11E09 - April 21 2024 [AMZN WEBDL-1080p][EAC3 2.0][h264]-NTb.mkv
Apr 25, 2024 19:16:36.054 [131229884099384] DEBUG - [Req#477/Transcode] TPU: hardware transcoding: final decoder: nvdec, final encoder: nvenc
Apr 25, 2024 19:16:36.055 [131229884099384] DEBUG - [Req#477/Transcode/JobRunner] Job running: CUDA_CACHE_PATH="/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Cache/Shaders/CUDA" FFMPEG_EXTERNAL_LIBS='/var/lib/plexmediaserver/Library/Application\ Support/Plex\ Media\ Server/Codecs/ad47460-4673-linux-x86_64/' X_PLEX_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx "/usr/lib/plexmediaserver/Plex Transcoder" -codec:0 h264 -hwaccel:0 nvdec -hwaccel_fallback_threshold:0 10 -threads:0 1 -hwaccel_output_format:0 cuda -hwaccel_device:0 cuda -ss 863 -analyzeduration 20000000 -probesize 20000000 -i "/mnt/nas/tv_shows/Last Week Tonight with John Oliver (2014)/Season 11/Last Week Tonight with John Oliver (2014) - S11E09 - April 21 2024 [AMZN WEBDL-1080p][EAC3 2.0][h264]-NTb.mkv" -filter_complex "[0:0]hwupload[0];[0]scale_cuda=w=720:h=404:format=nv12[1]" -map "[1]" -codec:0 h264_nvenc -b:0 1959k -maxrate:0 2613k -bufsize:0 5226k -forced-idr:0 1 -r:0 29.969999999999999 -map 0:1 -metadata:s:1 language=eng -codec:1 copy -copypriorss:1 0 -f segment -segment_format matroska -segment_format_options live=1 -segment_time 1 -segment_header_filename header -segment_start_number 0 -segment_list "http://127.0.0.1:32400/video/:/transcode/session/f9vyf0dixiw4ad3d49zhgdbc/358eb46c-ef07-4cc7-9873-aee87b09466a/manifest?X-Plex-Http-Pipeline=infinite" -segment_list_type csv -segment_list_unfinished 1 -segment_list_size 5 -segment_list_separate_stream_times 1 -avoid_negative_ts disabled -map_metadata:g -1 -map_metadata:c -1 -map_chapters -1 "chunk-%05d" -start_at_zero -copyts -init_hw_device cuda=cuda:pci:0000:42:00.0 -filter_hw_device cuda -y -nostats -loglevel quiet -loglevel_plex error -progressurl http://127.0.0.1:32400/video/:/transcode/session/f9vyf0dixiw4ad3d49zhgdbc/358eb46c-ef07-4cc7-9873-aee87b09466a/progress
  3. What does nvidia-smi show? Which driver version?
    (Sometimes the drivers themselves have bugs.)
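
For anyone checking their own logs for the same lines, here is a minimal Python sketch that pulls the decoder/encoder pair out of log text. The function name `final_codecs` is made up for illustration, and the pattern is based only on the "final decoder: ..., final encoder: ..." line quoted above, so it may miss other wordings.

```python
import re

# Matches the summary line quoted above, e.g.
# "TPU: hardware transcoding: final decoder: nvdec, final encoder: nvenc".
# Assumption: both names are plain word characters, as in that line.
FINAL_RE = re.compile(r"final decoder: (\w+), final encoder: (\w+)")

def final_codecs(log_text):
    """Return a list of (decoder, encoder) pairs found in the log text."""
    return [(m.group(1), m.group(2)) for m in FINAL_RE.finditer(log_text)]

sample = (
    "Apr 25, 2024 19:16:36.054 [131229884099384] DEBUG - [Req#477/Transcode] "
    "TPU: hardware transcoding: final decoder: nvdec, final encoder: nvenc"
)
print(final_codecs(sample))  # [('nvdec', 'nvenc')]
```

An empty list only means the pattern did not match; it does not by itself prove the transcode ran in software.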

I am on the latest driver from Nvidia. I noticed the transcoding was not working after upgrading, but I am not certain that the driver upgrade caused the problem. Do you recommend downgrading and testing?

root@plex:/opt/plexupdate# nvidia-smi
Thu Apr 25 20:04:31 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.76                 Driver Version: 550.76         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro P5000                   Off |   00000000:42:00.0 Off |                  Off |
| 26%   36C    P8             11W /  180W |       5MiB /  16384MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

@mdlamoureux

You didn’t install the Nvidia drivers that Ubuntu has vetted, did you? :thinking:

I have Driver Version: 535.161.08 (P2200) and one of Plex’s ninjas is running 535.171.04 (A4000) with the new Ubuntu 24.04 release.

We think you’ve likely created the issue by installing the unvetted/unmatched drivers.

Can you uninstall them and then run apt install nvidia-drivers?
(This will get you the vetted drivers)
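
To compare driver branches between machines (535 versus 550 in this thread), a small Python sketch can read the version out of the nvidia-smi header. The helper names `driver_version` and `driver_branch` are hypothetical, and the header string is copied from the output posted above.

```python
import re

def driver_version(smi_text):
    """Extract the 'Driver Version' string from nvidia-smi output, or None."""
    m = re.search(r"Driver Version:\s*([\d.]+)", smi_text)
    return m.group(1) if m else None

def driver_branch(version):
    """Return the major branch number, e.g. 535 for '535.161.08'."""
    return int(version.split(".")[0])

# Header line as posted earlier in this thread.
header = "| NVIDIA-SMI 550.76 Driver Version: 550.76 CUDA Version: 12.4 |"
version = driver_version(header)
print(version, driver_branch(version))  # 550.76 550
```

On a live system the text could come from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`. Running it on both the host and the container makes a version mismatch easy to spot.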

Thanks for the pointers. Over the past few days I’ve been doing a lot of testing. Unfortunately, the biggest problem was that I do not know what changed or how long hardware transcoding has been broken.

My Ubuntu instance is an LXC on Proxmox. For years my server has used the configuration prescribed here: Plex GPU transcoding in Docker on LXC on Proxmox – jocke.

Your comment pointed me to the differences between the 550 and 535 branches of the Nvidia driver. I’ve never used Ubuntu’s apt to install my driver, since the guide above didn’t say to do so.

I made the changes listed below, and so far in my testing I’m getting a consistent experience again. Midway through my troubleshooting, some transcodes started working while others still failed. It was not just downgrading the drivers to the 535 series that resolved the issue.

Below are all of the changes I made, listed in case they help others. I’ll report back if I encounter issues. Unfortunately, I do not know which of these changes corrected whatever was misconfigured on my system.

  • I had upgraded to PVE 8.2, so I pinned my configuration back to the 6.5 kernel.
  • I downgraded the Nvidia driver to the latest in the 535 line, which is 535.171.04, on both the host and the LXC.
  • I reinstalled the nvidia-persistenced-init service on the Proxmox host.
  • I monitored changes to my cgroup2 device numbers and found that some were missing, so I added them.
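
The cgroup2 check in the last bullet can be sketched in Python. This assumes the container config uses `lxc.cgroup2.devices.allow` lines of the form `c <major>:* rwm`. The function names and the sample major numbers are illustrative only and are not taken from my system.

```python
import glob
import os
import re
import stat

# Matches lines such as "lxc.cgroup2.devices.allow: c 195:* rwm"
# (":" or "=" accepted as the separator).
ALLOW_RE = re.compile(
    r"^lxc\.cgroup2\.devices\.allow\s*[:=]\s*c\s+(\d+):", re.MULTILINE
)

def allowed_majors(conf_text):
    """Major numbers of character devices the container config allows."""
    return {int(m) for m in ALLOW_RE.findall(conf_text)}

def nvidia_device_majors(pattern="/dev/nvidia*"):
    """Major numbers of the Nvidia character devices present on the host."""
    majors = set()
    for path in glob.glob(pattern):
        st = os.stat(path)
        if stat.S_ISCHR(st.st_mode):  # skip directories such as nvidia-caps
            majors.add(os.major(st.st_rdev))
    return majors

def missing_majors(device_majors, conf_text):
    """Device majors that exist on the host but are not allowed in the config."""
    return sorted(set(device_majors) - allowed_majors(conf_text))

# Illustrative config text and device majors (hypothetical values).
conf = (
    "arch: amd64\n"
    "lxc.cgroup2.devices.allow: c 195:* rwm\n"
    "lxc.cgroup2.devices.allow: c 508:* rwm\n"
)
print(missing_majors({195, 511}, conf))  # [511]
```

On the Proxmox host, `missing_majors(nvidia_device_majors(), open(path).read())` would do the same comparison against a real container config, where `path` is that container's config file.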

Thanks for the pointers.

I’ve had similar issues on my Proxmox server, where everything was working fine and then one day HW transcoding through Nvidia stopped working for Windows clients, and then for the rest.

I did start a thread at HW Transcoding for Windows Clients failing
