Hardware Transcoding Not Working

Server Version#: Version 1.40.4.8679
Player Version#: Plex Web Version 4.134.2
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.183.01 Sun May 12 19:39:15 UTC 2024

What logs should I check to see why hardware transcoding isn’t working?

Questions:

  1. Are you using the Ubuntu/distro-vetted drivers?
  2. Have you installed the additional libnvidia-decode and libnvidia-encode packages?
  3. Does nvidia-smi show the card as detected and available?
  4. If the Nvidia card was added after Plex was installed, is the plex user a member of the group which owns /dev/dri/card0 and renderD128?
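
For question 4, a rough way to check the group membership is sketched below. It assumes the service user is named plex and that the render node is /dev/dri/renderD128; both are guesses about your setup, and has_group is just a helper name made up for this sketch.

```shell
#!/bin/sh
# has_group GROUPS NAME: succeed if NAME appears in the
# space-separated list GROUPS (hypothetical helper).
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Assumptions: adjust both to match your system.
svc_user=plex
node=/dev/dri/renderD128

if [ -e "$node" ] && id "$svc_user" >/dev/null 2>&1; then
    owner_group=$(stat -c %G "$node")      # group owning the device node
    if has_group "$(id -nG "$svc_user")" "$owner_group"; then
        echo "$svc_user is in group $owner_group"
    else
        echo "$svc_user is NOT in group $owner_group"
        echo "candidate fix: sudo usermod -aG $owner_group $svc_user"
    fi
else
    echo "skipped: $node or user $svc_user not present"
fi
```

If the user has to be added to the group, the Plex service needs a restart before the new membership applies.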

I was able to figure this out. Looked at the logs and saw this issue:

Cannot load libnvcuvid.so.1
Failed loading nvcuvid.
Failed setup for format cuda: hwaccel initialisation returned error.
fallback to software decoding
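
For anyone else hunting for these lines, something like the sketch below can pull them out of the server logs. The log directory is the usual default for a Debian/Ubuntu package install, but treat it as an assumption; PLEX_LOG_DIR and scan_hw_errors are names invented for this example.

```shell
#!/bin/sh
# scan_hw_errors DIR: print log lines under DIR that match the
# failure messages quoted above (hypothetical helper).
scan_hw_errors() {
    grep -rhiE 'Cannot load libnvcuvid|Failed loading nvcuvid|hwaccel initiali[sz]ation returned error' "$1" 2>/dev/null
}

# Assumption: default log location; override with PLEX_LOG_DIR if yours differs.
LOG_DIR=${PLEX_LOG_DIR:-/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Logs}

if [ -d "$LOG_DIR" ]; then
    scan_hw_errors "$LOG_DIR" | tail -n 20
else
    echo "log directory not found: $LOG_DIR"
fi
```

Start a playback that forces a transcode first, so the log contains a fresh attempt.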

Based on the version of the Nvidia driver I had, I was able to run a single apt install in order to get hw transcoding working.

sudo apt install libnvidia-decode-535-server
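
The package name has to match the loaded driver, so here is a small sketch of how the name could be derived from /proc/driver/nvidia/version. It assumes the -server flavour, as on my system; drop the suffix if you run the plain driver. driver_major is a helper name made up for this example.

```shell
#!/bin/sh
# driver_major: read an NVRM version line on stdin and print the
# major driver version (the first dotted number on the line).
driver_major() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1 | cut -d. -f1
}

if [ -r /proc/driver/nvidia/version ]; then
    major=$(grep NVRM /proc/driver/nvidia/version | driver_major)
    # Assumption: the "-server" flavour is installed.
    echo "candidates: libnvidia-decode-${major}-server libnvidia-encode-${major}-server"
else
    echo "no NVIDIA kernel module loaded"
fi
```

It only prints candidate names; check them with apt policy before installing anything.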

While troubleshooting, I noticed that I don’t seem to have nvidia-smi installed. I’ve got HW transcoding working now. Is there any reason to install nvidia-smi?

Which CPU? If it’s an Intel with QSV, and nvidia is working correctly, you’ll be able to select which one to use.

Settings - Server - Transcoding - SHOW ADVANCED.

Now scroll down and select which GPU to use.

nvidia-smi comes with the nvidia drivers

In the first screenshot you can see that I already have my Nvidia card selected to Transcode with.

Also, I posted the results of a cat /proc/driver/nvidia/version

cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.183.01 Sun May 12 19:39:15 UTC 2024

So, I’ve got the drivers and hw transcoding is working. I’m probably not going to poke it for fear of breaking something, but I was just wondering if there’s any reason for me to use nvidia-smi or just stick with what I have now. I’m pretty sure I’m using the packages listed here: https://ubuntu.com/server/docs/nvidia-drivers-installation (I’ve had this system running for multiple years and the drivers are being updated with apt.)

The magic came from installing that package. Showing that you had the drivers was fine.

The missing decoder and encoder libraries are where it stopped.
PMS saw the card but could not use it. That’s why I asked about those specific packages.

Note: this broke for me again and I don’t know why. Troubleshooting seemed to be pointing to mismatching driver versions. I ended up installing nvidia-smi and either that or one of the dependencies fixed my issue. Hopefully this will ensure that encoding/decoding via Video Card is more reliable. We shall see…

@Dhalgren

Please run the following and share the output (it should look mostly like mine below)

[chuck@lizum ~.2000]$ dpkg -l | grep ^ii | grep nvidia | awk '{print $2}'
libnvidia-cfg1-550:amd64
libnvidia-common-550
libnvidia-compute-550:amd64
libnvidia-compute-550:i386
libnvidia-decode-550:amd64
libnvidia-decode-550:i386
libnvidia-egl-wayland1:amd64
libnvidia-egl-wayland1:i386
libnvidia-encode-550:amd64
libnvidia-encode-550:i386
libnvidia-extra-550:amd64
libnvidia-fbc1-550:amd64
libnvidia-fbc1-550:i386
libnvidia-gl-550:amd64
libnvidia-gl-550:i386
libnvidia-ml-dev:amd64
nvidia-compute-utils-550
nvidia-cuda-dev:amd64
nvidia-cuda-gdb
nvidia-cuda-toolkit
nvidia-cuda-toolkit-doc
nvidia-dkms-550
nvidia-driver-550
nvidia-firmware-550-550.107.02
nvidia-kernel-common-550
nvidia-kernel-source-550
nvidia-prime
nvidia-profiler
nvidia-settings
nvidia-utils-550
nvidia-visual-profiler
screen-resolution-extra
xserver-xorg-video-nvidia-550
[chuck@lizum ~.2001]$ 

Which card are you using?

EDIT: nvidia-detector will tell you which driver is recommended for your system. This is how Ubuntu/Debian determine which drivers to install on a fresh OS install. You can use that output to your advantage.

sudo apt install $(nvidia-detector)

Look at what it proposes and accept if you agree.

libnvidia-cfg1-535-server:amd64
libnvidia-compute-535-server:amd64
libnvidia-decode-535-server:amd64
linux-modules-nvidia-525-server-generic
linux-modules-nvidia-535-server-5.15.0-118-generic
linux-modules-nvidia-535-server-5.15.0-119-generic
linux-modules-nvidia-535-server-generic
linux-objects-nvidia-535-server-5.15.0-118-generic
linux-objects-nvidia-535-server-5.15.0-119-generic
linux-signatures-nvidia-5.15.0-118-generic
linux-signatures-nvidia-5.15.0-119-generic
nvidia-compute-utils-535-server
nvidia-firmware-535-server-535.183.06
nvidia-headless-no-dkms-525-server
nvidia-headless-no-dkms-535-server
nvidia-kernel-common-535-server
nvidia-kernel-source-535-server
nvidia-utils-535-server
kevin@plexmatrix:~$ nvidia-detector
nvidia-driver-550

Card is NVIDIA GeForce GTX 1070 Ti per nvidia-smi

Based on nvidia-detector I should be using the 550 drivers, but appear to be using the 535 drivers. What’s the best way to update these? apt isn’t upgrading (which I guess makes sense due to the packages being version specific.)
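
A quick way to see which driver series are actually installed, and whether they are mixed, is sketched below. driver_series is a made-up helper; it assumes the series is a three-digit field that follows a name part, which holds for the package names in this thread but may not hold everywhere.

```shell
#!/bin/sh
# driver_series: read package names on stdin and print each distinct
# driver series found in them, e.g. "535-server" or "550".
driver_series() {
    awk -F- '{
        for (i = 2; i <= NF; i++) {
            f = $i; sub(/:.*/, "", f)            # strip ":amd64" etc.
            # three digits directly after a name part (skips kernel ABI numbers)
            if (f ~ /^[0-9][0-9][0-9]$/ && $(i-1) ~ /^[a-z]/) {
                n = $(i+1); sub(/:.*/, "", n)
                s = f
                if (n == "server") s = f "-server"
                print s
                break
            }
        }
    }' | sort -u
}

if command -v dpkg >/dev/null 2>&1; then
    echo "installed series:"
    dpkg -l | awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }' | driver_series
fi
if command -v nvidia-detector >/dev/null 2>&1; then
    echo "recommended series:"
    nvidia-detector | driver_series
fi
```

More than one line under "installed series" means a mix of driver generations or flavours.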

Ah, I see now that you provided a command that feeds the nvidia-detector output into an apt install.

I upgraded and had to reboot. As a note, I tried to run nvidia-smi prior to rebooting, and I got the same version mismatch error as before (although this time the version is different; it said 535 last time):

kevin@plexmatrix:~$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 550.107
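
My understanding of this error is that the kernel module still loaded in memory is the old version while the freshly installed userspace libraries are the new one. A sketch of a check for that follows. It assumes /sys/module/nvidia/version and modinfo report the loaded and on-disk module versions on your system; versions_mismatch is a helper name made up here.

```shell
#!/bin/sh
# versions_mismatch A B: succeed when the two version strings differ.
# NVML may print a shortened version (550.107 for 550.107.02), so a
# prefix match in either direction is treated as "same".
versions_mismatch() {
    case "$1" in "$2"*) return 1 ;; esac
    case "$2" in "$1"*) return 1 ;; esac
    return 0
}

# Assumptions: these two sources exist on your system.
loaded=$(cat /sys/module/nvidia/version 2>/dev/null)
ondisk=$(modinfo -F version nvidia 2>/dev/null)

if [ -n "$loaded" ] && [ -n "$ondisk" ]; then
    if versions_mismatch "$loaded" "$ondisk"; then
        echo "loaded $loaded, installed $ondisk: a reboot is probably needed"
    else
        echo "loaded module $loaded matches the installed one"
    fi
else
    echo "nvidia module not loaded or modinfo unavailable"
fi
```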

After rebooting:

kevin@plexmatrix:~$ nvidia-smi
Fri Sep  6 21:54:47 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02             Driver Version: 550.107.02     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1070 Ti     Off |   00000000:02:00.0 Off |                  N/A |
|  0%   44C    P8             11W /  180W |       2MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
kevin@plexmatrix:~$ dpkg -l | grep ^ii | grep nvidia | awk '{print $2}'
libnvidia-cfg1-550:amd64
libnvidia-common-550
libnvidia-compute-550:amd64
libnvidia-decode-550:amd64
libnvidia-egl-wayland1:amd64
libnvidia-encode-550:amd64
libnvidia-extra-550:amd64
libnvidia-fbc1-550:amd64
libnvidia-gl-550:amd64
linux-objects-nvidia-535-server-5.15.0-118-generic
linux-objects-nvidia-535-server-5.15.0-119-generic
linux-signatures-nvidia-5.15.0-118-generic
linux-signatures-nvidia-5.15.0-119-generic
nvidia-compute-utils-550
nvidia-dkms-550
nvidia-driver-550
nvidia-firmware-535-server-535.183.06
nvidia-firmware-550-550.107.02
nvidia-kernel-common-550
nvidia-kernel-source-550
nvidia-prime
nvidia-settings
nvidia-utils-550
screen-resolution-extra
xserver-xorg-video-nvidia-550

Should I do an apt remove on the 535 stuff and the generic linux signatures?

After installing the drivers (it knows to remove the old ones), you have to update the initramfs and reboot so the new kernel modules (DKMS) get loaded.

sudo update-initramfs -c -k all

The -k all makes it cover all installed kernels.

You can also look at sudo apt autoremove to remove old drivers

autoremove removed the firmware, but didn’t remove all the 535 stuff:

kevin@plexmatrix:~$ dpkg -l | grep ^ii | grep nvidia | awk '{print $2}'
libnvidia-cfg1-550:amd64
libnvidia-common-550
libnvidia-compute-550:amd64
libnvidia-decode-550:amd64
libnvidia-egl-wayland1:amd64
libnvidia-encode-550:amd64
libnvidia-extra-550:amd64
libnvidia-fbc1-550:amd64
libnvidia-gl-550:amd64
linux-objects-nvidia-535-server-5.15.0-118-generic
linux-objects-nvidia-535-server-5.15.0-119-generic
linux-signatures-nvidia-5.15.0-118-generic
linux-signatures-nvidia-5.15.0-119-generic
nvidia-compute-utils-550
nvidia-dkms-550
nvidia-driver-550
nvidia-firmware-550-550.107.02
nvidia-kernel-common-550
nvidia-kernel-source-550
nvidia-prime
nvidia-settings
nvidia-utils-550
screen-resolution-extra
xserver-xorg-video-nvidia-550

Add a | grep 535 to that to find the remaining 535 references.

kevin@plexmatrix:~$ dpkg -l | grep ^ii | grep nvidia | grep 535 | awk '{print $2}'
linux-objects-nvidia-535-server-5.15.0-118-generic
linux-objects-nvidia-535-server-5.15.0-119-generic

I just realized I installed the plain 550 driver and not the 550-server driver.
I ended up with a mix as well.

[chuck@glockner ~.1998]$ sudo dpkg -l | grep nvidia | grep 535
rc  libnvidia-compute-535-server:amd64     535.183.06-0ubuntu0.22.04.1              amd64        NVIDIA libcompute package
rc  nvidia-compute-utils-535-server        535.183.06-0ubuntu0.22.04.1              amd64        NVIDIA compute utilities
rc  nvidia-dkms-535-server                 535.183.06-0ubuntu0.22.04.1              amd64        NVIDIA DKMS package
ii  nvidia-firmware-535-server-535.183.06  535.183.06-0ubuntu0.22.04.1              amd64        Firmware files used by the kernel module
rc  nvidia-kernel-common-535-server        535.183.06-0ubuntu0.22.04.1              amd64        Shared files used with the kernel module
[chuck@glockner ~.1999]$
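
In that listing, rc means the package was removed but its config files are still on disk, while ii means it is still installed. A small sketch for listing just the leftovers is below; residual_packages is a name invented for this example.

```shell
#!/bin/sh
# residual_packages: read `dpkg -l` output on stdin and print the names
# of NVIDIA packages in state "rc" (removed, config files remain).
residual_packages() {
    awk '$1 == "rc" && $2 ~ /nvidia/ { print $2 }'
}

if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | residual_packages
fi

# Review the list first. If it looks right, one possible cleanup is:
#   sudo apt purge $(dpkg -l | residual_packages)
```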

Let me sort this out and get you updated steps

Thanks. I went ahead and did a sudo apt install nvidia-driver-550-server

As a note, one of the final steps was rebuilding initramfs:

Setting up libnvidia-decode-550-server:amd64 (550.90.07-0ubuntu0.22.04.1) ...
Setting up nvidia-utils-550-server (550.90.07-0ubuntu0.22.04.1) ...
Setting up libnvidia-encode-550-server:amd64 (550.90.07-0ubuntu0.22.04.1) ...
Setting up nvidia-driver-550-server (550.90.07-0ubuntu0.22.04.1) ...
Processing triggers for initramfs-tools (0.140ubuntu13.4) ...
update-initramfs: Generating /boot/initrd.img-5.15.0-119-generic
Processing triggers for libc-bin (2.35-0ubuntu3.8) ...
Processing triggers for man-db (2.10.2-1) ...

After doing that and another apt autoremove, here is where I’m at (I also verified that hw transcoding is working):

kevin@plexmatrix:~$ dpkg -l | grep ^ii | grep nvidia | awk '{print $2}'
libnvidia-cfg1-550-server:amd64
libnvidia-common-550-server
libnvidia-compute-550-server:amd64
libnvidia-decode-550-server:amd64
libnvidia-encode-550-server:amd64
libnvidia-extra-550-server:amd64
libnvidia-fbc1-550-server:amd64
libnvidia-gl-550-server:amd64
linux-objects-nvidia-535-server-5.15.0-118-generic
linux-objects-nvidia-535-server-5.15.0-119-generic
linux-signatures-nvidia-5.15.0-118-generic
linux-signatures-nvidia-5.15.0-119-generic
nvidia-compute-utils-550-server
nvidia-dkms-550-server
nvidia-driver-550-server
nvidia-firmware-550-server-550.90.07
nvidia-kernel-common-550-server
nvidia-kernel-source-550-server
nvidia-prime
nvidia-settings
nvidia-utils-550-server
screen-resolution-extra
xserver-xorg-video-nvidia-550-server
kevin@plexmatrix:~$ dpkg -l | grep ^ii | grep nvidia | grep 535 | awk '{print $2}'
linux-objects-nvidia-535-server-5.15.0-118-generic
linux-objects-nvidia-535-server-5.15.0-119-generic
kevin@plexmatrix:~$ nvidia-smi
Fri Sep  6 22:20:54 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1070 Ti     Off |   00000000:02:00.0 Off |                  N/A |
|  0%   43C    P8             11W /  180W |       2MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+