HW acceleration correctly chooses NVIDIA discrete GPU, but tries to use vaapi instead of NVDEC/NVENC

Server Version#: 1.30.2.6563
Player Version#: Android 9.14.0.37895 / Samsung 5.53.5 Platform v4 / Web player (Firefox) on machine hosting PMS

My PMS runs on a Linux machine with an Intel i7-6700HQ CPU and an NVIDIA GTX 1060 GPU. I had been using HW Acceleration for a few months with (apparently) no issues. I started noticing issues about 1.5 months ago, where some videos would just not play on some of my devices, displaying an error message on the player. Eventually I realized this was only happening with videos that had to be transcoded. Further investigation showed what appeared to be PMS attempting to use the NVIDIA GPU, but with VA-API (see plex-logs-without-forced-device.log attached), failing to do so and then not falling back on other transcoding methods.

After reading through other forum posts, I set HardwareDevicePath="/dev/dri/renderD129" in my Preferences.xml. This still had PMS attempting to use VA-API, but now it fell back to software transcoding after failing to use VA-API (see plex-logs-with-forced-device.log attached).
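
For anyone repeating this test, here is a minimal sketch for checking whether the preference is actually set. It assumes Preferences.xml is a single `<Preferences>` element whose settings are stored as attributes; the function name and the trimmed sample are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def get_hardware_device_path(prefs_xml: str) -> Optional[str]:
    """Return the HardwareDevicePath attribute, or None when it is unset."""
    root = ET.fromstring(prefs_xml)
    return root.get("HardwareDevicePath")


# Trimmed, illustrative sample; a real Preferences.xml carries many more attributes.
SAMPLE = '<Preferences HardwareDevicePath="/dev/dri/renderD129"/>'

if __name__ == "__main__":
    print(get_hardware_device_path(SAMPLE))
```

A result of `None` means PMS is left to pick the device on its own.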

Finally, I removed the HardwareDevicePath setting, disabled the nvidia kernel module and restarted PMS. This time, PMS used the Intel iGPU for HW transcoding with no issues.

Unfortunately, when I originally enabled HW acceleration and it was working, I set it and forgot about it, so I can’t really say if it was using the iGPU, the NVIDIA GPU or always falling back to SW transcoding. I’m also not sure which version of PMS I was using when I first noticed the issue, nor what the last working version was.

One last piece of information: I’ve seen a few posts recommending disabling the iGPU from the BIOS, but my BIOS doesn’t appear to have that option, so that’s a no-go. I hope this is enough information, but if it’s not, I’ll be glad to provide whatever else you may need.

NVIDIA Driver version: 525.85.05

plex-logs-with-forced-device.log (7.9 KB)
plex-logs-without-forced-device.log (1.8 KB)

First

The transcoder doesn’t know what the device is when it starts.

Probe sequence is VAAPI then Nvidia

Second

The Intel Core Skylake (-6xxx) CPUs cannot transcode HEVC HDR to SDR.

Without seeing your actual DEBUG logs of this, it’s impossible to know definitively.

Request

Please make certain DEBUG logging is enabled (Server) and Verbose is disabled.

Now recreate the playback failure.

Download the logs ZIP file from PMS and attach here.

Thanks for the quick response!

I’ve attached the requested logs. I’ve tagged them accordingly, whether they had HardwareDevicePath set or not. Like my previous test cases, the first set of logs is for the preference unset and PMS basically refuses to play the media file or fall back on SW transcoding. The second set is with the device preference set, and PMS falls back to SW transcoding.

I’ve added a third set of logs where, after restarting the server a few times, it appeared to actually use the NVIDIA GPU to transcode, as evidenced by the output of nvidia-smi, so I’m very confused now:

Fri Jan 27 08:01:01 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05    Driver Version: 525.85.05    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   63C    P2    42W /  78W |    184MiB /  6144MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      7719      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A     61228      C   ...diaserver/Plex Transcoder      175MiB |
+-----------------------------------------------------------------------------+

I don’t believe any of my content is HDR. I for sure have HEVC content, but no HDR. I’ll double-check later today, regardless.

Do let me know if you need anything else. Thanks!

Plex Media Server Logs_2023-01-27_07-51-03.zip (HardwareDevicePath unset)|attachment (143.0 KB)
Plex Media Server Logs_2023-01-27_07-55-16.zip (HardwareDevicePath set)|attachment (293.7 KB)
Plex Media Server Logs_2023-01-27_08-04-55.zip (HardwareDevicePath set, working?)|attachment (396.6 KB)

When you have an Intel CPU with QSV, the QSV device will show up as renderD128 (the default) unless it’s in an ESXi VM (they swap).

The Nvidia GPU will be /dev/dri/renderD129

Please observe here:

Jan 27, 2023 07:54:04.282 [0x7f1462bf3b38] DEBUG - [GPU] Got device: GP106M [GeForce GTX 1060 Mobile], nvidia@unknown, default true, best true, ID /dev/dri/renderD129, DevID [10de:1c20:1462:11ac], flags 0xe
Jan 27, 2023 07:54:04.282 [0x7f1462bf3b38] DEBUG - [GPU] Got device: HD Graphics 530, intel@builtin, default false, best false, ID /dev/dri/renderD128, DevID [8086:191b:1462:11ac], flags 0xca
Jan 27, 2023 07:54:04.282 [0x7f1462bf3b38] INFO - Preemptively preparing driver icr for GPU HD Graphics 530

This segment right here

GPU HD Graphics 530

is the evidence of the Skylake. It’s fine for everything up to, but not including, HEVC HDR or AV1.

You need KabyLake (630) for HEVC HDR and AlderLake (730) for AV1.
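
The device list can also be pulled out of the log mechanically. This is a rough sketch that assumes the `[GPU] Got device:` lines keep the layout shown above; the regex and function name are illustrative and not part of PMS.

```python
import re

# Matches the "[GPU] Got device: ..." DEBUG lines as they appear in this thread.
GPU_LINE = re.compile(
    r"\[GPU\] Got device: (?P<name>.+?), (?P<driver>\S+), "
    r"default (?P<default>true|false), best (?P<best>true|false), "
    r"ID (?P<path>\S+),"
)


def parse_gpu_devices(log_text):
    """Return one dict per GPU that the log reports."""
    devices = []
    for m in GPU_LINE.finditer(log_text):
        devices.append({
            "name": m["name"],
            "driver": m["driver"],
            "default": m["default"] == "true",
            "best": m["best"] == "true",
            "path": m["path"],
        })
    return devices


SAMPLE_LOG = (
    "DEBUG - [GPU] Got device: GP106M [GeForce GTX 1060 Mobile], nvidia@unknown, "
    "default true, best true, ID /dev/dri/renderD129, DevID [10de:1c20:1462:11ac], flags 0xe\n"
    "DEBUG - [GPU] Got device: HD Graphics 530, intel@builtin, "
    "default false, best false, ID /dev/dri/renderD128, DevID [8086:191b:1462:11ac], flags 0xca\n"
)

if __name__ == "__main__":
    for dev in parse_gpu_devices(SAMPLE_LOG):
        print(dev["path"], dev["name"], "default" if dev["default"] else "")
```

On the two lines quoted above, the entry flagged `default true` is the NVIDIA card at renderD129.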

Jan 27, 2023 08:00:03.277 [0x7f1461acab38] DEBUG - [Req#44e6/Transcode/983183dc0aa35dca-com-plexapp-android] TPU: hardware transcoding: final decoder: nvdec, final encoder: nvenc
Jan 27, 2023 08:00:03.277 [0x7f1461acab38] DEBUG - [Req#44e6/Transcode/983183dc0aa35dca-com-plexapp-android/JobRunner] Job running: CUDA_CACHE_PATH="/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Cache/Shaders/CUDA" FFMPEG_EXTERNAL_LIBS='/var/lib/plexmediaserver/Library/Application\ Support/Plex\ Media\ Server/Codecs/acf6c67-4446-linux-x86_64/' X_PLEX_TOKEN=xxxxxxxxxxxxxxxxxxxx4e3c-afe5-a8a046343654 "/usr/lib/plexmediaserver/Plex Transcoder" -codec:0 h264 -hwaccel:0 nvdec -hwaccel_fallback_threshold:0 10 -threads:0 1 -hwaccel_output_format:0 cuda -hwaccel_device:0 cuda -codec:1 aac -ss 2832 -analyzeduration 20000000 -probesize 20000000 -i "/media/Media/JDownloads/Plex/Movies/Space Cowboys (2000)/Space.Cowboys.(2000).Bluray-1080p.x264.AAC.mkv" -ss 2832 -analyzeduration 20000000 -probesize 20000000 -i /home/gazy/plextmp/Transcode/Sessions/plex-transcode-983183dc0aa35dca-com-plexapp-android-1d932285-f984-4a68-8678-03cae2989491/temp-0.srt -filter_complex "[0:0]hwupload[0];[0]scale_cuda=w=720:h=300:format=nv12[1]" -map "[1]" -codec:0 h264_nvenc -b:0 944k -maxrate:0 1259k -bufsize:0 2518k -forced-idr:0 1 -r:0 23.975999999999999 -force_key_frames:0 "expr:gte(t,n_forced*8)" -filter_complex "[0:1] aresample=async=1:ochl='stereo':rematrix_maxval=4.000000dB:osr=48000[2]" -map "[2]" -metadata:s:1 language=eng -codec:1 libopus -b:1 161k -map 1:s:0 -metadata:s:2 language=spa -codec:2 ass -strict_ts:2 0 -map "0:t?" 
-codec:t copy -segment_format matroska -f ssegment -individual_header_trailer 0 -flags +global_header -segment_header_filename header -segment_time 8 -segment_start_number 354 -segment_copyts 1 -segment_time_delta 0.0625 -segment_list "http://127.0.0.1:32400/video/:/transcode/session/983183dc0aa35dca-com-plexapp-android/1d932285-f984-4a68-8678-03cae2989491/manifest?X-Plex-Http-Pipeline=infinite" -segment_list_type csv -segment_list_size 5 -segment_list_separate_stream_times 1 -segment_list_unfinished 1 -segment_format_options output_ts_offset=10 -max_delay 5000000 -avoid_negative_ts disabled -map_metadata:g -1 -map_metadata:c -1 -map_chapters -1 "media-%05d.ts" -start_at_zero -copyts -init_hw_device cuda=cuda: -filter_hw_device cuda -y -nostats -loglevel quiet -loglevel_plex error -progressurl http://127.0.0.1:32400/video/:/transcode/session/983183dc0aa35dca-com-plexapp-android/1d932285-f984-4a68-8678-03cae2989491/progress
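
To check quickly which decoder and encoder a session ended up with, the `final decoder` / `final encoder` DEBUG line is enough. This is a small sketch that assumes the wording stays exactly as in the line above; the helper name is made up, and any line in a different form simply returns `None`.

```python
import re

# Pattern taken from the "TPU: hardware transcoding: final decoder: ..." line in this thread.
FINAL_CODECS = re.compile(
    r"final decoder: (?P<decoder>\w+), final encoder: (?P<encoder>\w+)"
)


def final_hw_codecs(log_line):
    """Return (decoder, encoder) from a matching log line, or None if it does not match."""
    m = FINAL_CODECS.search(log_line)
    return (m["decoder"], m["encoder"]) if m else None


if __name__ == "__main__":
    line = "TPU: hardware transcoding: final decoder: nvdec, final encoder: nvenc"
    print(final_hw_codecs(line))
```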

To close this out.

  1. When the transcoder starts, it only knows “/dev/dri/renderD129”
  2. It doesn’t know if Intel or Nvidia
  3. It tries Intel first (why you see the messages)
  4. It then goes to the Nvidia.

This is how it works. Don’t let yourself get caught thinking there are problems when there aren’t. You’re just seeing the nitty-gritty in the logs.
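
The probe order described in the list above can be sketched as follows. This is purely illustrative and not the transcoder's actual code: `probe` is a hypothetical stand-in for whatever test the transcoder runs, and the API names are just labels.

```python
from typing import Callable, Iterable, Optional


def pick_hw_api(
    device: str,
    apis: Iterable[str],
    probe: Callable[[str, str], bool],
) -> Optional[str]:
    """Try each API in order on the device and return the first one that works.

    Returns None when nothing works, meaning the caller should use software transcoding.
    """
    for api in apis:
        if probe(device, api):
            return api
    return None


def fake_probe(device: str, api: str) -> bool:
    """Hypothetical probe: pretend renderD129 is an NVIDIA card and renderD128 is Intel."""
    supported = {
        "/dev/dri/renderD128": {"vaapi"},
        "/dev/dri/renderD129": {"nvenc"},
    }
    return api in supported.get(device, set())


if __name__ == "__main__":
    # VAAPI is tried first and fails on the NVIDIA node, then the NVIDIA API is tried.
    print(pick_hw_api("/dev/dri/renderD129", ["vaapi", "nvenc"], fake_probe))
```

With this ordering, a failed VAAPI attempt in the log is expected on an NVIDIA device and is not an error by itself.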

I want to clarify, since I don’t think I’m explaining myself properly.

  1. 99% of my stuff is 1080p H.264 (AVC). What little HEVC content I have is not HDR (preference left over from before I paid Plex Pass)
  2. You said

a. With no changes to my Preferences.xml, this is not the case for me. PMS chooses D129, attempts VAAPI, then just errors out and the player shows an error about being unable to play the media.

b. If I force D129 in Preferences, it tries VAAPI, then NVENC/NVDEC. I tested multiple times with the same file. Sometimes it works, sometimes it falls back to SW transcoding.

  3. If I disable my NVIDIA GPU altogether (rmmod nvidia...), D129 disappears as expected, PMS chooses D128, and successfully uses the Intel GPU for transcoding.

However you look at it, situation 2a is not properly doing fallback to other encoding methods. Playback stops entirely. Situation 2b works intermittently, at best. There is a problem somewhere. As I mentioned in (1) most of my content is AVC, and that’s the way it’s been for the past year or so. HW acceleration had been working fine, as far as I could tell, until about a month and a half ago. I can’t be 100% certain if HW acceleration was being used, but it was enabled and I never had any issues playing any media on any device, so either HW acceleration was working or at least the fallback logic was. Now, something isn’t working right.

Please do this for me?

  1. Open your Preferences.xml file.
  2. Find “PlexOnlineToken” and copy the value
  3. Open a new browser tab
  4. Now we construct a URL in the address bar to query PMS
http://ip.addr.of.PMS:32400/:/prefs?X-Plex-Token=PASTE_TOKEN_HERE

What you’ll get back is a lot of XML. What we’re interested in is this line:

<Setting id="HardwareDevicePath" label="Hardware transcoding device" summary="The GPU or other hardware device that will be used for transcoding" type="text" default="" value="" hidden="1" advanced="0" group="transcoder" enumValues=":Auto|nvidia@/dev/dri/renderD128:GP106GL [Quadro P2200]"/>

Please share with me the HardwareDevicePath line.

I haven’t asked before – Is PMS in a VM ?

Here’s the HardwareDevicePath line

<Setting id="HardwareDevicePath" label="Hardware transcoding device" summary="The GPU or other hardware device that will be used for transcoding" type="text" default="" value="" hidden="1" advanced="0" group="transcoder" enumValues=":Auto|nvidia@/dev/dri/renderD129:GP106M [GeForce GTX 1060 Mobile]|intel@/dev/dri/renderD128:HD Graphics 530"/>

PMS is not running on a VM. It’s an old gaming laptop, and PMS is running directly on the OS as a systemd service (no Docker or any other container runtime)
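
In case it helps anyone reading the `enumValues` attribute in the Setting line above, here is a minimal sketch that splits it into choices. It assumes entries are separated by `|` and that each entry is `value:label`, which is inferred from the two examples in this thread and not from any documentation. The function name and the trimmed sample element are made up.

```python
import xml.etree.ElementTree as ET


def parse_device_choices(setting_xml):
    """Split the enumValues of a <Setting> element into (value, label) pairs."""
    enum_values = ET.fromstring(setting_xml).get("enumValues", "")
    choices = []
    for entry in enum_values.split("|"):
        if not entry:
            continue
        # Assumed layout: everything before the first ':' is the value, the rest is the label.
        value, _, label = entry.partition(":")
        choices.append((value, label))
    return choices


# Trimmed to the two attributes that matter here.
SAMPLE_SETTING = (
    '<Setting id="HardwareDevicePath" '
    'enumValues=":Auto|nvidia@/dev/dri/renderD129:GP106M [GeForce GTX 1060 Mobile]'
    '|intel@/dev/dri/renderD128:HD Graphics 530"/>'
)

if __name__ == "__main__":
    for value, label in parse_device_choices(SAMPLE_SETTING):
        print(repr(value), "->", label)
```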

@ChuckPa

Is there any way to remove VAAPI and use only NVIDIA ?

I noticed there is always some delay when starting transcodes on NVIDIA, which does not happen with Intel GPU. And during the delay I see in the logs Codecs: hardware transcoding: testing API vaapi multiple times. Seems it’s trying VAAPI a few times on every transcode before settling on NVIDIA.

@gmahomarf

Please show me the output of

sudo lshw -C display

This will show all display-type devices found by the kernel

For my NUC8 (Hades Canyon) with Radeon and Intel GPUs, it looks like this:

root@lizum:/home/chuck# lshw -C display
  *-display                 
       description: VGA compatible controller
       product: Polaris 22 XT [Radeon RX Vega M GH]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: /dev/fb0
       version: c0
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi vga_controller bus_master cap_list rom fb
       configuration: depth=32 driver=amdgpu latency=0 mode=3840x2160 visual=truecolor xres=3840 yres=2160
       resources: iomemory:200-1ff iomemory:210-20f irq:191 memory:2000000000-20ffffffff memory:2100000000-21001fffff ioport:e000(size=256) memory:db500000-db53ffff memory:c0000-dffff
  *-display
       description: Display controller
       product: HD Graphics 630
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 04
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm bus_master cap_list
       configuration: driver=i915 latency=0
       resources: iomemory:2f0-2ef iomemory:2f0-2ef irq:190 memory:2ffe000000-2ffeffffff memory:2fa0000000-2fafffffff ioport:f000(size=64)
root@lizum:/home/chuck#

Look at the PCI addresses. /dev/dri is enumerated based on bus address

  1. HD Graphics 630 = pci@0000:00:02.0 → renderD128
  2. Polaris 22 XT = pci@0000:01:00.0 → renderD129

This is exactly what we get

root@lizum:/home/chuck# ls -la /dev/dri
total 0
drwxr-xr-x   3 root root        140 Jan 14 01:39 .
drwxr-xr-x  21 root root       5160 Jan 27 13:38 ..
drwxr-xr-x   2 root root        120 Jan 14 01:39 by-path
crw-rw----+  1 root render 226,   0 Jan 14 01:39 card0
crw-rw----+  1 root render 226,   1 Jan 27 11:08 card1
crw-rw----+  1 root render 226, 128 Jan 14 01:39 renderD128
crw-rw----+  1 root render 226, 129 Jan 14 01:39 renderD129
root@lizum:/home/chuck# 

On this NUC, PMS works without specifying HardwareDevicePath because it always finds the Intel QSV at renderD128 (the default path it’s looking for)

@Ossalingur

We’ve had the discussion a few times about providing this capability.
We decided against it because of how people like to ‘tinker’ with their settings and sooner or later (likely sooner) someone will start complaining their transcoding is BROKEN … all because they don’t know what they’re doing and selected the wrong one.

I’m willing to bring it up again but, before I do, I think we need a clear “how it works” definition of the logic flow.

  1. There must be a fully automatic setting – where it probes and finds out what’s there.

  2. If the desired device fails qualification (“Can it transcode this video?”), what should the fallback behavior be?

Here’s the output

❯ sudo lshw -C display
  *-display                 
       description: VGA compatible controller
       product: GP106M [GeForce GTX 1060 Mobile]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
       configuration: driver=nvidia latency=0
       resources: irq:137 memory:de000000-deffffff memory:b0000000-bfffffff memory:c0000000-c1ffffff ioport:e000(size=128) memory:df000000-df07ffff
  *-display
       description: VGA compatible controller
       product: HD Graphics 530
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       logical name: /dev/fb0
       version: 06
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list rom fb
       configuration: depth=32 driver=i915 latency=0 resolution=1920,1080
       resources: irq:128 memory:dd000000-ddffffff memory:70000000-7fffffff ioport:f000(size=64) memory:c0000-dffff

And, as you can see, the PCI addresses for the Intel GPU and the NVIDIA GPU match yours (s/NVIDIA/AMD/, of course):

❯ ll /dev/dri/by-path/
total 0
lrwxrwxrwx 1 root root  8 Jan 26 22:27 pci-0000:00:02.0-card -> ../card0
lrwxrwxrwx 1 root root 13 Jan 26 22:27 pci-0000:00:02.0-render -> ../renderD128
lrwxrwxrwx 1 root root  8 Jan 26 22:27 pci-0000:01:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 Jan 26 22:27 pci-0000:01:00.0-render -> ../renderD129
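
The same mapping can be read programmatically. This is a small sketch that assumes the `by-path` entries are named `pci-<address>-render` and are symlinks to the render node, as in the listing above; the function name is made up.

```python
import os


def render_nodes_by_pci(by_path_dir="/dev/dri/by-path"):
    """Map each PCI address to its render node name using the by-path symlinks."""
    mapping = {}
    for entry in sorted(os.listdir(by_path_dir)):
        # Only the "-render" links matter here; the "-card" links are skipped.
        if not (entry.startswith("pci-") and entry.endswith("-render")):
            continue
        pci_address = entry[len("pci-"):-len("-render")]
        target = os.readlink(os.path.join(by_path_dir, entry))
        mapping[pci_address] = os.path.basename(target)
    return mapping


if __name__ == "__main__":
    if os.path.isdir("/dev/dri/by-path"):
        for address, node in render_nodes_by_pci().items():
            print(address, "->", node)
```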

Intel is renderD128, nvidia is renderD129

That’s exactly as I expect

When you add `HardwareDevicePath="/dev/dri/renderD12X"` (X=8 or 9) to Preferences.xml, does it stop probing around?

(Ignore the frame testing spam … I’m already working on getting that reduced)

@ChuckPa I think how it works in JellyFin is really awesome.

They give you the option to select the default type of device, NVIDIA, INTEL, and some others. But when the selected option fails, it starts probing for all devices, and updates the default to whatever worked. So next time, it uses that device. So you have the option to select, but don’t ever really have to change it manually, unless you have multiple GPU.

JF starts transcoding fast on both, while NVIDIA is noticeably faster on JF than Plex. Intel is exactly the same speed on Plex & JF.
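
To make the flow concrete, here is a rough sketch of the behavior as described in this post. It is not taken from JellyFin's or Plex's code: `works` is a hypothetical check for whether a device can handle the transcode, and the device names are just labels.

```python
def choose_device(preferred, all_devices, works):
    """Return (device_to_use, new_preferred).

    Try the preferred device first. If it fails, probe the remaining devices
    and remember the first one that works as the new default.
    """
    if preferred is not None and works(preferred):
        return preferred, preferred
    for device in all_devices:
        if device != preferred and works(device):
            return device, device  # remember what worked for next time
    # Nothing worked: the caller falls back to software and keeps the old preference.
    return None, preferred


if __name__ == "__main__":
    only_intel_works = lambda device: device == "intel"
    print(choose_device("nvidia", ["nvidia", "intel"], only_intel_works))
```

The point of the second return value is that the user can pick a device but never has to change the setting by hand after a failure.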

When I set it to D128, it immediately probes that device for VAAPI compatibility and begins HW transcoding seamlessly. When set to D129, it checks that device for VAAPI, then NVDEC/NVENC, and then it can either begin HW transcoding then and there or fall back to SW transcoding. (The logs here, from one of my previous posts, show the behavior when Device is set to D129, on different attempts at transcoding the same file).

EDIT:

I just reread this part

It seems my PMS defaults to D129 when HardwareDevicePath is unset

@Ossalingur

I hear you but please don’t try to coerce me into pushing for a tit-for-tat “Just because JellyFin does” argument.

I’m an engineer but here I’m just support. I HEAR YOU

All I can do is keep trying.

You have a weird machine! Linux isn’t supposed to work that way!!!

:stuck_out_tongue:

*sigh* Story of my life. The stories I could tell you about how many of my devices/apps work in ways they’re not supposed to…

@gmahomarf

Maybe this is a PEBKAC scenario?

:rofl:
:rofl:

I’ve had those days too :see_no_evil:

Honestly? A laptop as a PMS server? You don’t want to know my machine

If it is PEBKAC, I’ll gladly accept the blame, wear the cone of shame and all that, but I’d like to know what I’m actually doing wrong first :rofl:.

I mean, it was just lying around, and I’ve never been a snob for streaming quality (at one point I even considered hosting PMS on a Raspberry Pi with external storage). I really only needed to be able to watch my stuff on the TV and on the go. Even HW Acceleration was just a bonus (I got the Plex Pass because I like to support products that I actively use, not for any particular feature). I just figured, I’ve got that GPU doing nothing, might as well make it work, ya know? If we end up chalking this entire thing up to ¯\_(ツ)_/¯, so be it. I had to try, at least.

While I work (part time; retired) for Plex, the engineer in me always wins. :slight_smile:
110TB , full quality (image) UHD BluRay rips, enough said? LOL

I think we can figure out what’s happening there but I never trust a laptop.

  1. The bios/firmware can be wonky – usually is highly customized
  2. The maker pulled crazy tricks on the PCI bus to get it all to fit in the box.

I have 3 NUCs, QNAP i7-7700, my Xeon with P2200 and they all work exactly the same even though the QNAP has a different OS. Our Lab has Synology boxes and they are the same too. → ALL QSV-capable machines come up with the GPU/QSV ASIC at renderD128

That occurs because of the kernel’s probe and init routines.

(I also used to do kernel work. You can’t probe the PCI bus until you know where it is… so you probe the entire CPU first)