Plex - Installed Nvidia Tesla T4

Hi All,

Back again, transcoding is all I post about. Everything else always goes great, promise!

I got a free Nvidia T4 card and could not pass it up. My Plex server was built with an Intel i9, so I had been using the iGPU; after a few posts and some DRI driver tinkering, hardware transcoding worked fine. I then installed the T4 and the driver for it.

root@Homer:/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Logs# ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001EB8sv000010DEsd000012A2bc03sc02i00
vendor : NVIDIA Corporation
model : TU104GL [Tesla T4]
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-510 - distro non-free
driver : nvidia-driver-515 - distro non-free recommended
driver : nvidia-driver-470 - distro non-free
driver : nvidia-driver-418-server - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin

So I installed nvidia-driver-515 through apt. I played a video on Plex and it is still choosing VAAPI.

Sep 05, 2022 19:22:30.945 [0x7f83196bcb00] DEBUG - [Req#13a9/Transcode] TPU: hardware transcoding: using hardware decode accelerator vaapi
Sep 05, 2022 19:22:30.945 [0x7f83196bcb00] DEBUG - [Req#13a9/Transcode] TPU: hardware transcoding: zero-copy support present
Sep 05, 2022 19:22:30.945 [0x7f83196bcb00] DEBUG - [Req#13a9/Transcode] TPU: hardware transcoding: using zero-copy transcoding
Sep 05, 2022 19:22:30.945 [0x7f83196bcb00] DEBUG - [Req#13a9/Transcode] Codecs: hardware transcoding: testing API vaapi
Sep 05, 2022 19:22:30.947 [0x7f83196bcb00] DEBUG - [Req#13a9/Transcode] TPU: hardware transcoding: final decoder: vaapi, final encoder: vaapi
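For anyone who wants to check this without scrolling through the whole log, here is a minimal Python sketch that pulls the "final decoder / final encoder" pair out of log text. It only assumes the line format shown above; the function name `final_codecs` is made up for illustration.

```python
import re

# Matches the summary line Plex writes once it has settled on a decoder/encoder,
# e.g. "... final decoder: vaapi, final encoder: vaapi"
FINAL_RE = re.compile(
    r"final decoder: (?P<decoder>[^,\s]+), final encoder: (?P<encoder>\S+)"
)

def final_codecs(log_text):
    """Return a list of (decoder, encoder) pairs in the order they appear."""
    return [(m["decoder"], m["encoder"]) for m in FINAL_RE.finditer(log_text)]

sample = (
    "Sep 05, 2022 19:22:30.947 [0x7f83196bcb00] DEBUG - [Req#13a9/Transcode] "
    "TPU: hardware transcoding: final decoder: vaapi, final encoder: vaapi"
)
print(final_codecs(sample))  # [('vaapi', 'vaapi')]
```

Feed it the contents of the server log and see which hardware path each transcode ended up on.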

I checked the dri folder, and Linux sees what I assume are the iGPU and the Nvidia card (I only had one card before).

root@Homer:/var/log# ls -ls /dev/dri
total 0
0 drwxr-xr-x 2 root root 120 Sep 5 18:56 by-path
0 crw-rw----+ 1 root render 226, 0 Sep 5 18:56 card0
0 crw-rw----+ 1 root render 226, 1 Sep 5 18:56 card1
0 crw-rw----+ 1 root render 226, 128 Sep 5 18:56 renderD128
0 crw-rw----+ 1 root render 226, 129 Sep 5 18:56 renderD129
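Rather than assuming which node is which, the PCI vendor ID behind each render node can be read from sysfs. This is a rough Python sketch, assuming the usual `/sys/class/drm/renderD*/device/vendor` layout; the function name and the small vendor table are my own, and 0x10de matches the `v000010DE` in the modalias above.

```python
from pathlib import Path

# PCI vendor IDs as they appear in sysfs (lower-case hex strings).
PCI_VENDORS = {"0x8086": "Intel", "0x10de": "NVIDIA", "0x1002": "AMD"}

def render_node_vendors(sysfs_root="/sys/class/drm"):
    """Map each /dev/dri/renderD* node to the vendor of the device behind it."""
    result = {}
    for node in sorted(Path(sysfs_root).glob("renderD*")):
        vendor_file = node / "device" / "vendor"
        try:
            vendor_id = vendor_file.read_text().strip().lower()
        except OSError:
            continue  # node without a readable PCI vendor file: skip it
        # Fall back to the raw ID when the vendor is not in the small table above.
        result[f"/dev/dri/{node.name}"] = PCI_VENDORS.get(vendor_id, vendor_id)
    return result

if __name__ == "__main__":
    for dev, vendor in render_node_vendors().items():
        print(dev, vendor)
```

The numbering depends on probe order, so it is worth confirming on the actual box before pointing Plex at a specific node.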

Now I do see these messages in syslog, so maybe the drivers I am using are unhappy:

Sep 5 19:22:31 Homer Plex Media Server[3615]: beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
Sep 5 19:22:31 Homer Plex Media Server[3615]: (If you have multiple ICDs installed and OpenCL works, you can ignore this message)
Sep 5 19:22:31 Homer Plex Media Server[3615]: beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
Sep 5 19:22:31 Homer Plex Media Server[3615]: (If you have multiple ICDs installed and OpenCL works, you can ignore this message)
Sep 5 19:22:31 Homer Plex Media Server[3615]: beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
Sep 5 19:22:31 Homer Plex Media Server[3615]: (If you have multiple ICDs installed and OpenCL works, you can ignore this message)

root@Homer:~# dpkg-query -l | grep intel-
ii intel-gmmlib 22.0.0 amd64 Intel(R) Graphics Memory Management Library Package
ii intel-igc-core 1.0.9636 amd64 Intel(R) Graphics Compiler for OpenCL™
ii intel-igc-opencl 1.0.9636 amd64 Intel(R) Graphics Compiler for OpenCL™
ii intel-level-zero-gpu 1.2.22081 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii intel-media-va-driver:amd64 20.1.1+dfsg1-1 amd64 VAAPI driver for the Intel GEN8+ Graphics family
ii intel-microcode 3.20220510.0ubuntu0.20.04.1 amd64 Processor microcode firmware for Intel CPUs
ii intel-opencl-icd 21.52.22081 amd64 Intel graphics compute runtime for OpenCL

This should correspond to version 21.49.21786, which works with my igpu.

I am wondering if I need to update the OpenCL runtime version I have. I know my version is from 2021; however, history has shown the newer stuff tends to not like the i9.

Looking for advice. From what I read, I expected Plex to pick the card before the iGPU.

When using the Nvidia GPU.

  1. IGNORE Intel Compute Runtime.

  2. Given you have both the iGPU QSV ASIC and the Nvidia,
    you will have /dev/dri/renderD128 (the Intel) and
    you will have /dev/dri/renderD129 (the Nvidia).

  3. Manually add preference HardwareDevicePath="/dev/dri/renderD129" to Preferences.xml – With Plex STOPPED – and without damaging anything else.

  4. Make certain you have decent Nvidia drivers. The bleeding-edge ones do tend to bleed on the floor.
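Step 3 can be done by hand in an editor, but here is a hedged Python sketch of the same edit for anyone nervous about damaging the file. It assumes Preferences.xml is a single `<Preferences .../>` element carrying its settings as attributes, and that it lives in the default Ubuntu location next to the Logs directory shown earlier. The helper names are hypothetical, and `xml_declaration=` needs Python 3.8 or newer.

```python
import shutil
import xml.etree.ElementTree as ET

# Assumed default location on Ubuntu (sibling of the Logs directory shown earlier).
PREFS = ("/var/lib/plexmediaserver/Library/Application Support/"
         "Plex Media Server/Preferences.xml")

def set_preference(xml_bytes, key, value):
    """Return Preferences.xml content with one attribute set on the root element."""
    root = ET.fromstring(xml_bytes)
    if root.tag != "Preferences":
        raise ValueError(f"unexpected root element: {root.tag!r}")
    root.set(key, value)  # every other attribute is left untouched
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)

def update_file(path=PREFS, device="/dev/dri/renderD129"):
    """Back up the file, then rewrite it in place. Run only with Plex STOPPED."""
    shutil.copy2(path, path + ".bak")
    with open(path, "rb") as fh:
        new_content = set_preference(fh.read(), "HardwareDevicePath", device)
    with open(path, "wb") as fh:
        fh.write(new_content)
```

Calling `update_file()` as a user that can write the file, with Plex stopped, leaves a `.bak` copy beside the original so the change is easy to roll back.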

[chuck@lizum NetMount.1810]$ gog nvidia-smi
Mon Sep  5 20:24:07 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02    Driver Version: 510.85.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P2200        On   | 00000000:07:00.0 Off |                  N/A |
| 47%   37C    P8     4W /  75W |      1MiB /  5120MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
[chuck@lizum NetMount.1811]$

I will give that XML a try. Here are the drivers I have. Yes, the Plex box has Xorg, but the Ubuntu Server edition had a few of those bleeding problems too.

Mon Sep  5 20:27:16 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:01:00.0 Off |                  Off |
| N/A   63C    P8    17W /  70W |     25MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1338      G   /usr/lib/xorg/Xorg                 22MiB |
+-----------------------------------------------------------------------------+

So I got it working for about 5 minutes transcoding some media. Then the server became unresponsive: no pings, nothing. Going to pull out a monitor to see what is up. Any thoughts as to what might be doing it?

Perhaps nvidia-driver-515 is a bit too bleeding edge and I should drop down to nvidia-driver-510 like you have.

Would need to see the DEBUG logs to even begin to guess.

I would start with backing off the driver a bit. Nvidia does have a habit of bleeding a bit which is why I sit back where I do.

When these stop working, I’ll upgrade to something newer but still back from the edge.

I am reading about cooling issues, and while I have lots of fans in my desktop build, nvidia-smi shows the card getting near 84C, which is apparently the trip point. For whatever reason, the GPU disappearing makes the entire PC unresponsive and I lose network.

I can 3d print a blower adapter and add the blower to the unit. So going to chase this rabbit to rule it out.
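To watch the card while testing the blower, something like the following Python sketch could poll `nvidia-smi` for temperature, power draw and performance state. It assumes `nvidia-smi` is on the PATH and uses its CSV query mode. The 84C limit is just the trip point mentioned above, not a figure I have verified, and the function names are made up.

```python
import subprocess

QUERY = "temperature.gpu,power.draw,pstate"

def parse_gpu_line(line):
    """Parse one CSV line such as '71, 18.00, P8' into a dict."""
    temp, power, pstate = [field.strip() for field in line.split(",")]
    try:
        power_w = float(power)
    except ValueError:
        power_w = None  # some cards report power as not available
    return {"temp_c": int(temp), "power_w": power_w, "pstate": pstate}

def read_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]

def too_hot(sample, limit_c=84):
    """True once the reported temperature reaches the assumed trip point."""
    return sample["temp_c"] >= limit_c
```

Calling `read_gpus()` in a loop with a short sleep and logging each sample to a file should show whether the temperature climbs toward the limit right before the box locks up.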

When was the last time you checked it for dust bunnies?

How many fans / what are their sizes?

Do the math and calculate the CFM being pushed through.

Now look at where that card is. Is it in the airflow where it gets best access to the air and can exhaust easily?

We have to deal with the fluid dynamics (air flow) when we build our own.
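As a back-of-the-envelope version of that math, here is a small Python sketch of the standard heat-balance estimate: airflow = heat / (air density x specific heat x allowed temperature rise). The air constants are textbook sea-level values, the 70 W figure is the T4 power cap shown in nvidia-smi, and the 10C air temperature rise is purely an assumption for illustration.

```python
AIR_DENSITY = 1.2          # kg/m^3, roughly sea level at room temperature
AIR_SPECIFIC_HEAT = 1005   # J/(kg*K)
M3S_TO_CFM = 2118.88       # cubic metres per second -> cubic feet per minute

def required_cfm(heat_watts, delta_t_c):
    """Airflow needed to carry heat_watts away with a delta_t_c rise in air temperature."""
    if delta_t_c <= 0:
        raise ValueError("delta_t_c must be positive")
    m3_per_s = heat_watts / (AIR_DENSITY * AIR_SPECIFIC_HEAT * delta_t_c)
    return m3_per_s * M3S_TO_CFM

# 70 W card, air allowed to warm by 10 C on its way through the heatsink
print(round(required_cfm(70, 10), 1))  # 12.3
```

That is air that actually has to pass through the card's fins, not total case airflow, which is why a passive server card can cook in a case that is otherwise well ventilated.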

I have a NAS tower with 21 drives (12x 12TB HDD and 9 SSDs) with a Xeon E5
and my P2200 doesn’t go above 50C even under full load.

In my build, there are 2x 140mm extracting out the top, 3x 140mm blowing in the front, and 1x 140mm rear vent. The PSU draws in its own air from the bottom and direct vents out the back (typical PSU)

I am using Fractal’s Torrent case (Torrent, Fractal Design) with the CPU liquid cooling add-on. So in my case I have the original bottom fans, and the front fans are reduced down a size to 180mm (x3) instead of 2 x 240mm. I also added a rear exhaust for kicks that is not in the original case design.

Overall my case is good according to my sensors:

root@Homer:~# sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +28.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +26.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +27.0°C (high = +100.0°C, crit = +100.0°C)
Core 2: +26.0°C (high = +100.0°C, crit = +100.0°C)
Core 3: +28.0°C (high = +100.0°C, crit = +100.0°C)
Core 4: +27.0°C (high = +100.0°C, crit = +100.0°C)
Core 5: +27.0°C (high = +100.0°C, crit = +100.0°C)
Core 6: +28.0°C (high = +100.0°C, crit = +100.0°C)
Core 7: +26.0°C (high = +100.0°C, crit = +100.0°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1: +27.8°C (crit = +105.0°C)

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1: +34.0°C

nvme-pci-0600
Adapter: PCI adapter
Composite: +40.9°C (low = -273.1°C, high = +84.8°C)
(crit = +84.8°C)
Sensor 1: +40.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +40.9°C (low = -273.1°C, high = +65261.8°C)

But when you look at the T4, I certainly am not pushing air where it wants it. Here is an adapter I found: "NVIDIA Tesla T4/P4 blower fan adapter" by aw_ on Thingiverse. I found it explained nicely here: "How to avoid an NVIDIA Tesla P4 to overheat" (vHojan.nl).

It is a server GPU after all, so there is no fan attached to the unit. I just underestimated how much additional cooling it needs beyond the fans I already have.

As far as dust goes, i just cleaned it out when i added the t4. I only had dust on the bottom fans despite the mesh screen. The rest of the case was clean after about 5 months of usage. So i was pretty happy with that. My Plex is in the basement, not exactly a clean environment for sure.

I have the Fractal 7 XL case

Do you have Persistence enabled?

What does nvidia-smi show when not transcoding?

This is mine

[chuck@lizum NetMount.1810]$ gog nvidia-smi
Mon Sep  5 20:24:07 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02    Driver Version: 510.85.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P2200        On   | 00000000:07:00.0 Off |                  N/A |
| 47%   37C    P8     4W /  75W |      1MiB /  5120MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
[chuck@lizum NetMount.1811]$

Notice:

  1. P8 state (lowest)
  2. Power 4W
  3. Temp 37C

For now, I had to take the T4 out as it kept crashing the system even when not transcoding.

The last picture i took showed two plex transcodes, and an Xorg in the process list. state P8, temp 83C, power 4W

IF this is your card,

It’s designed for up to 50C.

Whoa on:

That can’t be 4 watts.

4 watts on P8, maybe :)

Actually, here is the card without Plex transcode in my ssh history.

Mon Sep  5 22:42:21 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02    Driver Version: 510.85.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:01:00.0 Off |                  Off |
| N/A   71C    P8    18W /  70W |     32MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1481      G   /usr/lib/xorg/Xorg                 32MiB |
+-----------------------------------------------------------------------------+

Yep, i corrected my post but you posted faster.

That’s a beautiful case but it’s not cutting it for you with this GPU.

Do you have their highest CFM fans in the front?

Sent you a PM

Noticed you’re running your display on the card as well? I don’t.
The machine runs Ubuntu Server with IPMI for admin

I have to check the fans, it came with the radiator from what i recall.

I don’t use the display and the card has no outputs, but I had issues running Ubuntu Server and getting /dev/dri to show up and work. The server edition had compatibility issues with my iGPU.
I cut my losses and went with the Desktop version, which worked; otherwise I would have had to go back further on the LTS version, which was not worth it.

I have an MSI MB and do not have a built-in IPMI/drac kind of option. I wish! I get tired of walking down to the basement to troubleshoot. Next time i will find a server mb with drac.
