ESXi / NVIDIA GPU Passthrough / Ubuntu Server 18.04

Server Version#: Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-74-generic x86_64)
Player Version#: plexmediaserver_1.18.5.2260-056ab4be9_amd64

As the title suggests, I installed a GTX 1050 into my DELL R720.
I’ve followed multiple guides on how to get hardware transcoding working, but it just doesn’t seem to want to.

The card is detected by the OS
lspci | grep VGA
00:0f.0 VGA compatible controller: VMware SVGA II Adapter
0b:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050] (rev a1)

But PLEX doesn’t seem to recognize it:
Jan 22, 2020 08:38:40.202 [0x7f7a50ff9700] DEBUG - Codecs: hardware transcoding: testing API nvenc
Jan 22, 2020 08:38:40.328 [0x7f7a50ff9700] DEBUG - Codecs: hardware transcoding: opening hw device failed - probably not supported by this system, error: Unknown error occurred

Jan 22, 2020 08:38:41.355 [0x7f7aa6ffd700] DEBUG - TPU: hardware transcoding: enabled, but no hardware decode accelerator found
Jan 22, 2020 08:38:41.356 [0x7f7aa6ffd700] DEBUG - TPU: hardware transcoding: zero-copy support not present
Jan 22, 2020 08:38:41.356 [0x7f7aa6ffd700] DEBUG - TPU: hardware transcoding: final decoder: , final encoder:
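For anyone retracing this, the lines above can be pulled out of the server log with a small filter. This is only a minimal sketch: the function name hw_transcode_lines is made up for illustration, and the log path is an assumption based on the default Application Support directory of the Ubuntu package.

```shell
# Print only the hardware-transcoding lines from a Plex Media Server log.
# Reads from stdin so it works on any copy of the log.
hw_transcode_lines() {
    grep -F 'hardware transcoding:'
}

# Assumed default log location for the Ubuntu package; adjust if yours differs.
PMS_LOG="/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Logs/Plex Media Server.log"

if [ -r "$PMS_LOG" ]; then
    hw_transcode_lines < "$PMS_LOG"
fi
```

The DEBUG lines only show up when debug logging is enabled on the server, so an empty result does not by itself mean hardware transcoding was never attempted.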

I’d like to stay away from using Docker if at all possible.

Any help would be appreciated.

Thanks

Did you install the Nvidia drivers and let them go to their default location?
Which version of the drivers are you using?

The player won’t be 1.18.5, that’s a server version number.

If you have not rebooted the system since installing PMS 1.18.5, you should find /tmp/plexinstaller.log still exists.

May I see it please?

Yes and:

dkms status
nvidia, 440.33.01, 4.15.0-74-generic, x86_64: installed
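A quick way to compare the driver version DKMS built against the version of the module the kernel actually loaded is sketched below. The function names are hypothetical, and the parser only handles the "nvidia, <version>, <kernel>, <arch>: installed" layout shown in the dkms output above; other dkms releases may format this line differently.

```shell
# Extract the NVIDIA driver version from `dkms status` output on stdin.
# Only handles the "nvidia, <version>, <kernel>, <arch>: installed" layout.
nvidia_dkms_version() {
    awk -F', *' '$1 == "nvidia" { print $2 }'
}

# Version of the nvidia kernel module that is currently loaded, if any.
nvidia_loaded_version() {
    if [ -r /sys/module/nvidia/version ]; then
        cat /sys/module/nvidia/version
    fi
}

if command -v dkms >/dev/null 2>&1; then
    built="$(dkms status | nvidia_dkms_version | head -n 1)"
    loaded="$(nvidia_loaded_version)"
    echo "dkms build: ${built:-none}  loaded module: ${loaded:-not loaded}"
fi
```

If the two versions differ, or the module is not loaded at all, that would point toward an unclean driver install rather than toward Plex.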

Rebooted since install, but installed again:

cat /tmp/plexinstaller.log

# Plex Media Server installation configuration info:  Thu Jan 23 02:32:04 UTC 2020
Init=0
Systemd=1
LinuxContainer=0
NewInstall=0
HaveOverride=0
OverrideFile=""
PlexUser="plex"
PlexGroup="plex"
VideoGroup="video"
AppSuppDir="/var/lib/plexmediaserver/Library/Application Support"
PlexTempDir="/var/lib/plexmediaserver/tmp_transcoding"
LangEncoding="en_US.UTF-8"
ExistingVersion=11805
HaveHardware=1
NeedUser=0
NeedGroup=0
NeedVideo=0
Verbose=1
IsRunning=1

This is a fresh install, so not much has been done to it aside from installing the drivers and Plex.

How to read that is pretty simple. The obvious names are just that.

HaveHardware=1 tells me the CPU has QSV transcoding visible at /dev/dri/renderD128.

The installation script is 100% passive so cannot tell if it works. It trusts the kernel wouldn’t have listed it if not.
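Since the installer log is just KEY=value lines, individual settings can be read back with a few lines of shell. This is a rough sketch, not part of the Plex installer: installer_value is a made-up helper name, and it assumes the log keeps the flat layout shown in the listing above.

```shell
# Print the value of KEY from a plexinstaller.log-style file (KEY=value lines),
# with surrounding double quotes removed.
installer_value() {
    key="$1"
    file="$2"
    sed -n "s/^${key}=//p" "$file" | sed 's/^"\(.*\)"$/\1/'
}

INSTALL_LOG=/tmp/plexinstaller.log
if [ -r "$INSTALL_LOG" ]; then
    echo "HaveHardware: $(installer_value HaveHardware "$INSTALL_LOG")"
    echo "VideoGroup:   $(installer_value VideoGroup "$INSTALL_LOG")"
fi
```

As noted, a HaveHardware value of 1 only says that a render node was visible at install time, not that transcoding on it works.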

What still needs verifying is whether the file /dev/dri/renderD128 belongs to the video group and whether, for user plex, getent group plex also reports video.

Where this process fails is when

  1. The CPU is capable of QSV - Kernel assigns first at renderD128
  2. A graphics card is inserted - Kernel will assign at renderD129

I didn’t want to get into detecting if multiple devices (CPU & GPU) are present.
My intent was to report if something was visible & accessible by the rendering video group at the default location.
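The access check described above can be sketched in shell as follows. The helper names are hypothetical, the user name plex and the node path come from this thread, and stat -c assumes GNU coreutils as shipped on Ubuntu.

```shell
# Succeeds if USER belongs to GROUP (primary or supplementary).
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

# Print the owning group of a file, e.g. a /dev/dri node (GNU stat).
file_group() {
    stat -c '%G' "$1"
}

NODE=/dev/dri/renderD128
if [ -e "$NODE" ]; then
    grp="$(file_group "$NODE")"
    if user_in_group plex "$grp"; then
        echo "plex is in group $grp, which owns $NODE"
    else
        echo "plex is NOT in group $grp, which owns $NODE"
    fi
fi
```

Pointing NODE at renderD129 instead repeats the same check for a second device, which is the case the installer does not try to detect.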

What does ls -la /dev/dri show?

the plex group doesn’t have the video user in it, but the video group does have plex, is that what you’re looking for?

$ getent group plex
plex:x:999:
$ getent group video
video:x:44:plex
$ ls -la /dev/dri
total 0
drwxr-xr-x  3 root root       140 Jan 23 02:22 .
drwxr-xr-x 18 root root      4020 Jan 23 02:22 ..
drwxr-xr-x  2 root root       120 Jan 23 02:22 by-path
crw-rw----  1 root video 226,   0 Jan 23 02:32 card0
crw-rw----  1 root video 226,   1 Jan 23 02:32 card1
crw-rw----  1 root video 226, 128 Jan 23 02:32 renderD128
crw-rw----  1 root video 226, 129 Jan 23 02:32 renderD129

I also get the following error when I attempt to run nvidia-smi. From some reading online, people say this points to a faulty GPU, but if I take the card out of the R720 and put it in my Windows 10 machine, it works without issue. I haven’t tried a different slot on the riser card or a different riser card yet (the only other available PCIe slots are only x8).

$ nvidia-smi 
Unable to determine the device handle for GPU 0000:03:00.0: Unknown Error

Group wise, Plex is a member of video. I wrote it backwards. That part is good.

In this case, Plex will use the CPU’s ASIC by default.

If Nvidia’s own software can’t talk to it, then either:

  1. Nvidia package is out of revision or not installed cleanly.
    -or-
  2. The card is defective.

Windows does rendering entirely differently, so nothing can be carried over to Linux.
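One way to gather evidence for the first possibility is to check whether the nvidia kernel module is loaded and what it wrote to the kernel log. The sketch below assumes the driver's kernel messages carry the usual "NVRM:" prefix; the function name is made up, and the output cannot prove a card defective, it only shows what the driver reported.

```shell
# Keep only NVIDIA kernel-driver messages (prefixed "NVRM:") from a kernel log on stdin.
nvidia_kernel_messages() {
    grep -F 'NVRM:'
}

# Is the nvidia kernel module loaded at all?
if [ -d /sys/module/nvidia ]; then
    echo "nvidia module: loaded"
else
    echo "nvidia module: not loaded"
fi

# Reading the kernel log may need root on some systems.
if command -v dmesg >/dev/null 2>&1; then
    dmesg 2>/dev/null | nvidia_kernel_messages \
        || echo "no NVRM messages found (or dmesg not readable)"
fi
```

Running it in a freshly built VM with a single clean driver install would help separate a broken installation from a problem with the card or the passthrough itself.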

How can I tell if either of these is accurate? It’s running in an ESXi environment, so spinning up another Ubuntu VM to test is no problem at all. I installed/reinstalled the drivers about 4-5 times, so maybe something got screwed up along the way. Should I be using a specific driver for my card, or ubuntu-drivers autoinstall?

If the card were defective, I can only assume that it wouldn’t work when I put it in a physical windows host.

Maybe I’ll try assigning it to a windows VM and see if that works.

For any and all that info, I unfortunately have no choice but to refer you elsewhere. (Nvidia Forums?)

  1. I don’t have an Nvidia card here.
  2. I use a NUC
  3. Kinda out of scope to diagnose problems with such things.
  4. Unfortunately, I am swamped otherwise I’d go digging with you to find it.

No problem, thanks for your time. I’ll report back here if I make any progress for others who might run into the same issue.

Hi bmacmillan. I am a complete noob at Linux, but after digging around the net, I cobbled together a hack that works. I hope the team at Plex can incorporate this into their next build and make it more user friendly.

My setup is an 8th gen Intel NUC. I run ESXi with Ubuntu as one of the guest VMs. I enabled pass-through of the Intel Iris graphics to the Ubuntu server, and it does detect the GPU and has the appropriate Intel driver for it. I ran into exactly the same thing as you in the /dev/dri folder: card0, card1, renderD128, and renderD129.

I think what is going on is card0 and renderD128 refer to the built-in SVGA driver from ESXi for it to run the console. You cannot disable this. The card1 and renderD129 refer to the GPU, in my case, the intel Iris graphics. Plex was insisting on using card0/renderD128.
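Rather than guessing which node belongs to which device, the kernel driver bound to each node can be read from sysfs. The following is a minimal sketch: drm_driver is a hypothetical helper, and the SYSFS_DRM override exists only so the function can be exercised without real hardware.

```shell
# Print the kernel driver bound to a DRM node such as renderD128.
# SYSFS_DRM can be overridden; it defaults to the real sysfs location.
drm_driver() {
    link="${SYSFS_DRM:-/sys/class/drm}/$1/device/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "unknown"
    fi
}

for node in /dev/dri/card* /dev/dri/renderD*; do
    [ -e "$node" ] || continue
    name="$(basename "$node")"
    echo "$name -> $(drm_driver "$name")"
done
```

On a setup like this one, I would expect the ESXi console adapter to show up under VMware's vmwgfx driver and the passed-through device under its own driver (i915 for Intel graphics, nvidia for an NVIDIA card), but that is an expectation, not something verified in this thread.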

After digging around for a while, I used the ln command to link to the correct hardware:

ln -sf /dev/dri/renderD129 /dev/dri/renderD128
ln -sf /dev/dri/card1 /dev/dri/card0

After this, (hw) appeared on both encoding and decoding of the video stream. CPU usage dropped from 100% down to less than 10%.

To make them automatic, I added them to the crontab to start at startup.

@reboot sudo ln -sf /dev/dri/renderD129 /dev/dri/renderD128
@reboot sudo ln -sf /dev/dri/card1 /dev/dri/card0
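A slightly more defensive variant of the same hack is sketched below. It does nothing different in principle (it still replaces the original device node with a symlink, so it remains a workaround rather than a fix), but it refuses to create the link when the target node is missing. The function name and the script path in the comment are hypothetical.

```shell
# Point LINK at TARGET only if TARGET exists; refuse otherwise.
relink_node() {
    target="$1"
    link="$2"
    if [ ! -e "$target" ]; then
        echo "skip: $target does not exist" >&2
        return 1
    fi
    ln -sfn "$target" "$link"
}

# Intended to run as root at boot, e.g. from root's crontab via a
# hypothetical script path:  @reboot /usr/local/sbin/relink-dri.sh
# Uncomment only where renderD129/card1 really are the passed-through GPU:
# relink_node /dev/dri/renderD129 /dev/dri/renderD128
# relink_node /dev/dri/card1 /dev/dri/card0
```

Run from root's own crontab, the sudo prefix is not needed, and the missing-target guard avoids leaving a dangling renderD128 link if the device numbering changes after a reboot.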

Give it a shot and see if it works for you too. And if Plex developers are reading this, please add an option in the software to choose which card to use. It would have saved a noob like me so much time!

Linux noobs should refrain from giving advice on device entries in /dev without fully understanding how they got there and why they are named as they are.

I don’t know the Preferences.xml option to set to specify the device to use but will ask on Monday.
