Nvidia GPU does not decode properly in docker

OS: Ubuntu 22.04 with Linux kernel 6.3

Using vGPU with the GRID driver; nvidia-docker is installed correctly.

caleb@docker:~$ nvidia-smi
Tue May  9 01:57:25 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID RTX6000-2Q     On   | 00000000:01:00.0 Off |                  N/A |
| N/A   N/A    P0    N/A /  N/A |    173MiB /  2048MiB |      4%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     13633      C   ...ib/jellyfin-ffmpeg/ffmpeg      173MiB |
+-----------------------------------------------------------------------------+
caleb@docker:~$ sudo docker exec -it plex nvidia-smi
[sudo] password for caleb:
Tue May  9 02:08:13 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID RTX6000-2Q     On   | 00000000:01:00.0 Off |                  N/A |
| N/A   N/A    P8    N/A /  N/A |      0MiB /  2048MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
caleb@docker:~$

LOG:
Plex Media Server Logs_2023-05-09_01-51-14.zip (908.1 KB)

@ChuckPa Any help plz

You’re throttled from too many requests.

That only happens if there are permission problems where /config resides and PMS can’t store the codecs after downloading them.
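One way to test the permission theory is to try creating a file in the config directory as the same user the server runs as. The following is a minimal sketch; `can_store_files` is a hypothetical helper name, and `/config` is assumed to be the container's config mount as mentioned above.

```python
import os
import tempfile


def can_store_files(directory: str) -> bool:
    """Return True if the current user can create and delete a file in `directory`."""
    if not os.path.isdir(directory):
        return False
    try:
        # Creating a real file is a stronger check than os.access(), which
        # only inspects permission bits and can mislead on some filesystems.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        return False
```

Calling `can_store_files("/config")` inside the container, as the user that runs Plex Media Server, should return True if the directory is usable for downloaded codecs.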

May 09, 2023 01:22:25.380 [140203665996600] DEBUG - [Req#314/Transcode] [FFMPEG] - Loaded lib: libnvidia-encode.so.1
May 09, 2023 01:22:25.380 [140203665996600] DEBUG - [Req#314/Transcode] [FFMPEG] - Loaded sym: NvEncodeAPICreateInstance
May 09, 2023 01:22:25.380 [140203665996600] DEBUG - [Req#314/Transcode] [FFMPEG] - Loaded sym: NvEncodeAPIGetMaxSupportedVersion
May 09, 2023 01:22:25.559 [140203665996600] INFO - [Req#314/Transcode] CodecManager: obtaining decoder 'hevc'
May 09, 2023 01:22:25.559 [140203665996600] DEBUG - [Req#314/Transcode/HCl#b9] HTTP requesting GET https://plex.tv/api/codecs/hevc_decoder?build=linux-x86_64-standard&deviceId=d7630480-0556-4bca-9a9b-c8648460f36c&oldestPreviousVersion=1%2E31%2E1%2E6733-bc0674160&version=e51a01b-4528
May 09, 2023 01:22:25.890 [140203688287032] DEBUG - [HttpClient/HCl#b9] HTTP/2.0 (0.3s) 429 response from GET https://plex.tv/api/codecs/hevc_decoder?build=linux-x86_64-standard&deviceId=d7630480-0556-4bca-9a9b-c8648460f36c&oldestPreviousVersion=1%2E31%2E1%2E6733-bc0674160&version=e51a01b-4528 (reused)
May 09, 2023 01:22:25.890 [140203661777720] ERROR - [Req#314/Transcode] Codecs: Failed to download XML for codec 'hevc_decoder'
May 09, 2023 01:22:25.890 [140203665996600] DEBUG - [Req#314/Transcode] Codecs: testing hevc (decoder) with hwdevice vaapi
May 09, 2023 01:22:25.890 [140203665996600] WARN - [Req#314/Transcode] Failed to find decoder 'hevc'
May 09, 2023 01:22:25.890 [140203665996600] INFO - [Req#314/Transcode] CodecManager: obtaining decoder 'hevc'
May 09, 2023 01:22:25.890 [140203665996600] DEBUG - [Req#314/Transcode/HCl#ba] HTTP requesting GET https://plex.tv/api/codecs/hevc_decoder?build=linux-x86_64-standard&deviceId=d7630480-0556-4bca-9a9b-c8648460f36c&oldestPreviousVersion=1%2E31%2E1%2E6733-bc0674160&version=e51a01b-4528
May 09, 2023 01:22:26.217 [140203688287032] DEBUG - [HttpClient/HCl#ba] HTTP/2.0 (0.3s) 429 response from GET https://plex.tv/api/codecs/hevc_decoder?build=linux-x86_64-standard&deviceId=d7630480-0556-4bca-9a9b-c8648460f36c&oldestPreviousVersion=1%2E31%2E1%2E6733-bc0674160&version=e51a01b-4528 (reused)
May 09, 2023 01:22:26.217 [140203661777720] ERROR - [Req#314/Transcode] Codecs: Failed to download XML for codec 'hevc_decoder'
May 09, 2023 01:22:26.217 [140203665996600] DEBUG - [Req#314/Transcode] Codecs: testing hevc (decoder) with hwdevice nvdec
May 09, 2023 01:22:26.217 [140203665996600] WARN - [Req#314/Transcode] Failed to find decoder 'hevc'
May 09, 2023 01:22:26.218 [140203665996600] DEBUG - [Req#314/Transcode] Codecs: testing h264_nvenc (encoder)
May 09, 2023 01:22:26.218 [140203665996600] DEBUG - [Req#314/Transcode] Codecs: hardware transcoding: testing API nvenc
May 09, 2023 01:22:26.218 [140203665996600] DEBUG - [Req#314/Transcode] [FFMPEG] - Loaded lib: libcuda.so.1
May 09, 2023 01:22:26.218 [140203665996600] DEBUG - [Req#314/Transcode] [FFMPEG] - Loaded sym: cuInit
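To see how often the codec download is being rejected, the response lines in a log like the one above can be tallied by codec and HTTP status. This is only a sketch that assumes the line layout shown in the excerpt; `count_codec_responses` and the regular expression are illustrative and not part of Plex.

```python
import re
from collections import Counter

# Matches response lines such as:
#   ... HTTP/2.0 (0.3s) 429 response from GET https://plex.tv/api/codecs/hevc_decoder?...
RESPONSE_RE = re.compile(
    r"HTTP/\S+ \([^)]*\) (?P<status>\d{3}) response from GET "
    r"https://plex\.tv/api/codecs/(?P<codec>[^?\s]+)"
)


def count_codec_responses(log_text: str) -> Counter:
    """Count (codec, HTTP status) pairs for codec download responses."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RESPONSE_RE.search(line)
        if match:
            counts[(match["codec"], int(match["status"]))] += 1
    return counts
```

Run against the excerpt above, the two 429 responses for `hevc_decoder` would be counted under the key `("hevc_decoder", 429)`.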


Side note: I’ve never seen two Docker containers share a GPU correctly (different namespaces).

Putting both servers in the same namespace (e.g. on the main host) will work.

I am trying to give 777 permissions to the config directory. Is this going to work?

What does it mean that two Docker containers cannot share a GPU?

Are there any solutions?

You don’t need to use Docker.
No functionality is gained.

Run both, native on the host, and be done with it.

I tried to deal with it, thank you.

The reason I use docker is that it is easy to manage and easy to see the error log.

Is this error caused by docker?

I just installed plexmediaserver on the host, but it seems that it still can’t transcode normally. The following is the log.

Plex Media Server Logs_2023-05-09_17-38-51.zip (666.3 KB)

What I’m seeing here does not require video transcoding. It only requires audio transcoding.

May 09, 2023 17:35:10.518 [139713748359992] DEBUG - [JobRunner] Job running: FFMPEG_EXTERNAL_LIBS='/var/lib/plexmediaserver/Library/Application\ Support/Plex\ Media\ Server/Codecs/1378972-4547-linux-x86_64/' X_PLEX_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx "/usr/lib/plexmediaserver/Plex Transcoder" -codec:1 ac3 -analyzeduration 20000000 -probesize 20000000 -i '/video/电视剧/不死法医 (2014)/Season 1/不死法医 - S01E02 - 三思而后行.mkv' -filter_complex "[0:1] aresample=async=1:ochl='5.1':rematrix_maxval=0.000000dB:osr=48000[0]" -map "[0]" -metadata:s:0 language=eng -codec:0 flac -b:0 4096k -f flac -map_metadata -1 -map_chapters -1 -t 1292.48 "/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Cache/Transcode/Detection/a08d04ec-4618-400d-9110-07bbf6b979ac" -y -nostats -loglevel quiet -loglevel_plex error -progressurl http://127.0.0.1:32400/video/:/transcode/session/d187d4da-a5c9-4cee-851e-fa0eb2a719c5/b301c2fb-c65f-4297-bf20-6cf65bb236f4/progress

You need to play something with a setting that forces a video transcode.

Sorry, it still doesn’t seem to be playing correctly. Here are the latest logs.
Plex Media Server Logs_2023-05-10_01-42-41.zip (1.2 MB)
Here’s the video.
Record_2023-05-10-09-44-21_40deb401b9ffe8e1df2f1cc5ba480b12.zip (5.1 MB)

This all looks very odd, whether I’m using docker or not.

First thing I see in your logs is the UHD-770 QSV is not found.
(Intel Media Driver issue)

May 10, 2023 01:41:02.315 [140297704700728] DEBUG - [GPU] Got device: TU102GL [Quadro RTX 6000/8000], nvidia@unknown, default true, best true, ID /dev/dri/renderD128, DevID [10de:1e30:10de:1326], flags 0x70

Your Nvidia card is found but notice it’s at renderD128. This is where QSV should be.

Normal mapping, due to how the kernel starts:

  • QSV = D128
  • Nvidia = D129
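To check which driver actually owns each render node on a given machine, the sysfs entries under `/sys/class/drm` can be inspected. The sketch below assumes the usual Linux layout, where each `renderD*` entry has a `device/driver` symlink pointing at the bound kernel driver (for example `i915` or `nvidia`); `render_node_drivers` is a hypothetical helper name.

```python
import glob
import os


def render_node_drivers(drm_class_dir: str = "/sys/class/drm") -> dict:
    """Map each DRM render node (e.g. 'renderD128') to its kernel driver name."""
    mapping = {}
    for node in sorted(glob.glob(os.path.join(drm_class_dir, "renderD*"))):
        driver_link = os.path.join(node, "device", "driver")
        if os.path.exists(driver_link):
            # 'driver' is a symlink to the bound driver's directory;
            # its final path component is the driver name.
            driver = os.path.basename(os.path.realpath(driver_link))
        else:
            driver = None  # no driver bound, or an unexpected layout
        mapping[os.path.basename(node)] = driver
    return mapping
```

On a machine with only one GPU visible, a single `renderD128` entry would be expected, whichever vendor it belongs to.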

Do you have these packages installed?

ii libnvidia-decode-525-server:amd64 525.105.17-0ubuntu0.20.04.1 amd64 NVIDIA Video Decoding runtime libraries
ii libnvidia-encode-525-server:amd64 525.105.17-0ubuntu0.20.04.1 amd64 NVENC Video Encoding runtime library

Without them, PMS will see the card but won’t be able to use it.
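The package check can be scripted by filtering `dpkg -l` output for installed (`ii`) rows. This is a rough sketch that assumes the row layout shown above; `installed_nvidia_codec_packages` is a hypothetical helper. It only covers distro packages, so libraries installed another way (such as by a vendor installer) would not show up here.

```python
def installed_nvidia_codec_packages(dpkg_output: str) -> dict:
    """Return {package_name: version} for installed libnvidia-encode/decode packages."""
    found = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # dpkg -l rows: status, name, version, architecture, description...
        if len(fields) < 3 or fields[0] != "ii":
            continue
        name = fields[1].split(":")[0]  # drop an ':amd64' style suffix
        if name.startswith(("libnvidia-encode", "libnvidia-decode")):
            found[name] = fields[2]
    return found
```

Pass it the text printed by `dpkg -l`; an empty result means neither package is installed through the package manager.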

Here is the confirmation PMS can’t see either the QSV or the Nvidia.

I have an updated PMS with the new Intel Media Driver.
I will provide that to you if you want to use it.

I cannot speak to the nvidia. I only know our RTX 3040 Nvidia works as it should

I am using a VM to run plex media server.

UHD770 is not present in my VM, so Nvidia GPU should be renderD128.

00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
00:1a.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 03)
00:1a.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 03)
00:1a.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:1c.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:1c.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:1c.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:1d.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03)
00:1d.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 03)
00:1d.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: NVIDIA Corporation TU102GL [Quadro RTX 6000/8000] (rev a1)
05:01.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
05:02.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
05:03.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
05:04.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
06:03.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon
06:12.0 Ethernet controller: Red Hat, Inc. Virtio network device
06:13.0 Ethernet controller: Red Hat, Inc. Virtio network device
09:01.0 SCSI storage controller: Red Hat, Inc. Virtio SCSI
09:02.0 SCSI storage controller: Red Hat, Inc. Virtio SCSI

I’m using vGPU, and when I install libnvidia-encode and libnvidia-decode, my GRID driver is automatically removed, so I can’t install these two packages.

Also, my video card is not really an RTX 6000. It is a GTX 1660 SUPER that is presented as an RTX 6000 (vgpu_unlock).

00:00.0 Host bridge: Intel Corporation Device a704 (rev 01)
00:01.0 PCI bridge: Intel Corporation Device a70d (rev 01)
00:02.0 VGA compatible controller: Intel Corporation Device a780 (rev 04)
00:02.1 VGA compatible controller: Intel Corporation Device a780 (rev 04)
00:02.2 VGA compatible controller: Intel Corporation Device a780 (rev 04)
00:02.3 VGA compatible controller: Intel Corporation Device a780 (rev 04)
00:02.4 VGA compatible controller: Intel Corporation Device a780 (rev 04)
00:02.5 VGA compatible controller: Intel Corporation Device a780 (rev 04)
00:02.6 VGA compatible controller: Intel Corporation Device a780 (rev 04)
00:02.7 VGA compatible controller: Intel Corporation Device a780 (rev 04)
00:06.0 PCI bridge: Intel Corporation Device a74d (rev 01)
00:0a.0 Signal processing controller: Intel Corporation Device a77d (rev 01)
00:14.0 USB controller: Intel Corporation Device 7ae0 (rev 11)
00:14.2 RAM memory: Intel Corporation Device 7aa7 (rev 11)
00:15.0 Serial bus controller [0c80]: Intel Corporation Device 7acc (rev 11)
00:16.0 Communication controller: Intel Corporation Device 7ae8 (rev 11)
00:17.0 SATA controller: Intel Corporation Device 7ae2 (rev 11)
00:1a.0 PCI bridge: Intel Corporation Device 7ac8 (rev 11)
00:1c.0 PCI bridge: Intel Corporation Device 7ab8 (rev 11)
00:1c.4 PCI bridge: Intel Corporation Device 7abc (rev 11)
00:1d.0 PCI bridge: Intel Corporation Device 7ab7 (rev 11)
00:1f.0 ISA bridge: Intel Corporation Device 7a86 (rev 11)
00:1f.3 Audio device: Intel Corporation Device 7ad0 (rev 11)
00:1f.4 SMBus: Intel Corporation Device 7aa3 (rev 11)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Device 7aa4 (rev 11)
01:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 SUPER] (rev a1)
01:00.1 Audio device: NVIDIA Corporation TU116 High Definition Audio Controller (rev a1)
01:00.2 USB controller: NVIDIA Corporation TU116 USB 3.1 Host Controller (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller (rev a1)
02:00.0 Non-Volatile memory controller: Sandisk Corp Device 5019 (rev 01)
04:00.0 Non-Volatile memory controller: Silicon Motion, Inc. SM2263EN/SM2263XT SSD Controller (rev 03)
05:00.0 PCI bridge: ASMedia Technology Inc. Device 1806 (rev 01)
06:00.0 PCI bridge: ASMedia Technology Inc. Device 1806 (rev 01)
06:02.0 PCI bridge: ASMedia Technology Inc. Device 1806 (rev 01)
06:06.0 PCI bridge: ASMedia Technology Inc. Device 1806 (rev 01)
06:0e.0 PCI bridge: ASMedia Technology Inc. Device 1806 (rev 01)
07:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
09:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
0a:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
0b:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 04)

Do you mean to say that I should run plexmediaserver on the host? That would pollute the host environment.

Because OpenCL didn’t work properly when the UHD 770 was running in SR-IOV mode, I replaced it with the Nvidia graphics card.

Is it possible that the grid driver broke everything?

Anything is possible with your special case.
I can play the testfile on our machine without issue. There are no special drivers installed.

Has no one in the forum tried using the grid driver?

Is there anything else in the log that indicates what went wrong with the encoder?

GRID card? Please be specific.

From the logs you showed me, it’s a detection problem, not an encoding problem.

This is not an Nvidia GRID graphics card (K1, K2). Here, the GRID driver refers to the vGPU software.

The grid driver is just a special driver for the vgpu.

There is no difference in the core between Nvidia’s professional-grade and consumer-grade graphics cards; the difference between them comes mainly from the drivers.

All I did was spoof the GeForce GTX 1660 SUPER as an RTX 6000 to trick the driver into thinking it could enable vGPU, which is a community solution.

It seems that there is no detection problem in docker? It’s very strange.