Intermittent issues with transcoding on NVIDIA GPU

Server Version#: 1.30.0.6486 (linuxserver/plex docker)
Player Version#: All
Nvidia Driver Version#: 510.108.03
OS: Debian 12

When transcodes start, errors begin showing up in the logs. Usually, after 10-30 seconds of errors, playback finally starts. For some clients, though, it just ends with a playback or session error. Retrying a few times sometimes works, but some clients are more affected and have a harder time starting playback. Direct stream always works fine.

I have tried a few different servers and different NVIDIA GPUs, and the behaviour is always the same.

Docker is running in privileged mode, so the container sees everything from the host’s /dev. The container is started with PUID=0 and PGID=0, so all processes inside it run as root and have as much privilege as is possible inside a container.

nvidia-smi works and sees the GPU inside the container. It also shows the transcodes once they manage to start after the 10-30 second delay with errors.

There is no Intel GPU or any other GPU, so /dev/dri/renderD128 is the correct device. Adding HardwareDevicePath="/dev/dri/renderD128" to Preferences.xml has no effect.

Any ideas what could cause this, and how to resolve it?

Here are some examples of logs when the issue is occurring:

Jan 02, 2023 08:47:32.591 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:33.077 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:33.546 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:34.026 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:34.508 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:34.977 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:35.459 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:36.011 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:36.533 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:37.039 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:37.566 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 08:47:38.087 [0x7ff53eb29b38] ERROR - [Req#6566/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:35.273 [0x7ff55ebe2b38] ERROR - [Req#570/Transcode/hdyr99lf3qff5j5zmrnnfnm1/c760839c-e8b8-45e0-995f-f4134032f262] [mp4 @ 0x7f960d023980] failed to rename file chunk-stream0-00008.m4s.tmp to chunk-stream0-00008.m4s: No such file or directory
Jan 02, 2023 07:34:35.273 [0x7ff560317b38] ERROR - [Req#57d/Transcode/hdyr99lf3qff5j5zmrnnfnm1/c760839c-e8b8-45e0-995f-f4134032f262] av_interleaved_write_frame(): No such file or directory
Jan 02, 2023 07:34:35.378 [0x7ff55ebe2b38] ERROR - [Req#580/Transcode/hdyr99lf3qff5j5zmrnnfnm1/c760839c-e8b8-45e0-995f-f4134032f262] [mp4 @ 0x7f960d023d00] failed to rename file chunk-stream1-00008.m4s.tmp to chunk-stream1-00008.m4s: No such file or directory
Jan 02, 2023 07:34:49.010 [0x7ff560d72b38] INFO - [Req#72e/Transcode] CodecManager: starting EAE at "/tmp/pms-a362c36d-0d93-466f-b54a-588d298af179/EasyAudioEncoder"
Jan 02, 2023 07:34:54.573 [0x7ff55c96db38] ERROR - [Req#7fb/Transcode/4e4epov2euux75mhbb1xxfy7/757e6f46-8b55-45de-9db9-ae722d6db1cc] [mp4 @ 0x7f4060b826c0] failed to rename file chunk-stream0-00054.m4s.tmp to chunk-stream0-00054.m4s: No such file or directory
Jan 02, 2023 07:34:54.574 [0x7ff560d72b38] ERROR - [Req#818/Transcode/4e4epov2euux75mhbb1xxfy7/757e6f46-8b55-45de-9db9-ae722d6db1cc] av_interleaved_write_frame(): No such file or directory
Jan 02, 2023 07:34:54.580 [0x7ff55d540b38] ERROR - [Req#81d/Transcode/4e4epov2euux75mhbb1xxfy7/757e6f46-8b55-45de-9db9-ae722d6db1cc] [mp4 @ 0x7f4060b829c0] failed to rename file chunk-stream1-00054.m4s.tmp to chunk-stream1-00054.m4s: No such file or directory
Jan 02, 2023 07:34:56.274 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:56.550 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:56.990 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:57.567 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:58.007 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:58.440 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:58.884 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:59.166 [0x7ff55c96db38] WARN - [Req#8ac/Transcode/oh1tyw2kwa0nxg9osd13q1ir] Transcode runner appears to have died.
Jan 02, 2023 07:35:01.000 [0x7ff55d540b38] ERROR - [Req#855/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [eac3_eae @ 0x7f403d872640] EAE timeout! EAE not running, or wrong folder? Could not read '/tmp/pms-a362c36d-0d93-466f-b54a-588d298af179/EasyAudioEncoder/Convert to WAV (to 8ch or less)/xcbvy251hdmzwlytjwcowrv8_562-0-314.wav'
Jan 02, 2023 07:35:01.001 [0x7ff55febcb38] ERROR - [Req#8f0/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [eac3_eae @ 0x7f403d872640] error reading output: -5 (I/O error)
Jan 02, 2023 07:35:01.001 [0x7ff56137bb38] ERROR - [Req#8f3/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] Error while decoding stream #0:1: I/O error
Jan 02, 2023 07:35:01.183 [0x7ff55c96db38] INFO - [Req#8d0] AutoUpdate: no updates available
Jan 02, 2023 07:35:02.000 [0x7ff56137bb38] ERROR - [Req#8ab/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] [eac3_eae @ 0x7fc9bfc1d640] EAE timeout! EAE not running, or wrong folder? Could not read '/tmp/pms-a362c36d-0d93-466f-b54a-588d298af179/EasyAudioEncoder/Convert to WAV (to 8ch or less)/oh1tyw2kwa0nxg9osd13q1ir_630-0-21.wav'
Jan 02, 2023 07:35:02.001 [0x7ff560317b38] ERROR - [Req#915/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] [eac3_eae @ 0x7fc9bfc1d640] error reading output: -5 (I/O error)
Jan 02, 2023 07:35:02.001 [0x7ff560d72b38] ERROR - [Req#919/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] Error while decoding stream #0:1: I/O error
Jan 02, 2023 07:35:06.000 [0x7ff55febcb38] ERROR - [Req#92d/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [eac3_eae @ 0x7f403d872640] EAE timeout! EAE not running, or wrong folder? Could not read '/tmp/pms-a362c36d-0d93-466f-b54a-588d298af179/EasyAudioEncoder/Convert to WAV (to 8ch or less)/xcbvy251hdmzwlytjwcowrv8_562-0-315.wav'
Jan 02, 2023 07:35:06.001 [0x7ff56137bb38] ERROR - [Req#93a/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [eac3_eae @ 0x7f403d872640] error reading output: -5 (I/O error)
Jan 02, 2023 07:35:06.001 [0x7ff560317b38] ERROR - [Req#93d/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] Error while decoding stream #0:1: I/O error
Jan 02, 2023 07:35:06.689 [0x7ff55ebe2b38] INFO - [Req#910] AutoUpdate: no updates available
Jan 02, 2023 07:35:08.000 [0x7ff560317b38] ERROR - [Req#935/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] [eac3_eae @ 0x7fc9bfc1d640] EAE timeout! EAE not running, or wrong folder? Could not read '/tmp/pms-a362c36d-0d93-466f-b54a-588d298af179/EasyAudioEncoder/Convert to WAV (to 8ch or less)/oh1tyw2kwa0nxg9osd13q1ir_630-0-22.wav'
Jan 02, 2023 07:35:08.001 [0x7ff560d72b38] ERROR - [Req#962/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] [eac3_eae @ 0x7fc9bfc1d640] error reading output: -5 (I/O error)
Jan 02, 2023 07:35:08.001 [0x7ff561178b38] ERROR - [Req#966/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] Error while decoding stream #0:1: I/O error
Jan 02, 2023 07:35:09.641 [0x7ff56137bb38] ERROR - [Req#983/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [mp4 @ 0x7f403ec2a340] failed to rename file chunk-stream0-00038.m4s.tmp to chunk-stream0-00038.m4s: No such file or directory
Jan 02, 2023 07:35:09.642 [0x7ff560317b38] ERROR - [Req#986/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] av_interleaved_write_frame(): No such file or directory
Jan 02, 2023 07:35:09.643 [0x7ff560d72b38] ERROR - [Req#989/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [dash @ 0x7f403ec2a6c0] Unable to open chunk-stream0-00038.m4s.tmp for writing: No such file or directory
Jan 02, 2023 07:35:09.644 [0x7ff561178b38] ERROR - [Req#98b/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [mp4 @ 0x7f403ec2a340] failed to rename file chunk-stream0-00038.m4s.tmp to chunk-stream0-00038.m4s: No such file or directory
Jan 02, 2023 07:35:09.646 [0x7ff55c96db38] ERROR - [Req#98e/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] Error writing trailer of dash: No such file or directory
Jan 02, 2023 07:35:11.432 [0x7ff561178b38] ERROR - [Req#99e/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] [mp4 @ 0x7fc9c0fd5340] failed to rename file chunk-stream0-00002.m4s.tmp to chunk-stream0-00002.m4s: No such file or directory
Jan 02, 2023 07:35:11.432 [0x7ff55c96db38] ERROR - [Req#9a0/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] av_interleaved_write_frame(): No such file or directory
Jan 02, 2023 07:35:11.435 [0x7ff55d540b38] ERROR - [Req#9a4/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] [dash @ 0x7fc9c0fd56c0] Unable to open chunk-stream0-00002.m4s.tmp for writing: No such file or directory
Jan 02, 2023 07:35:11.435 [0x7ff55ebe2b38] ERROR - [Req#9a7/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] [mp4 @ 0x7fc9c0fd5340] failed to rename file chunk-stream0-00002.m4s.tmp to chunk-stream0-00002.m4s: No such file or directory
Jan 02, 2023 07:35:11.436 [0x7ff55febcb38] ERROR - [Req#9aa/Transcode/oh1tyw2kwa0nxg9osd13q1ir/87363d49-9427-45e8-a35c-737cf188c21b] Error writing trailer of dash: No such file or directory
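
A rough way to see which failure dominates a log excerpt like the one above is to count lines per error type. This is only a sketch: the function name `tally_transcode_errors` and the category labels are made up for illustration, and the match patterns are copied from the messages quoted in this thread.

```shell
# Count the error families seen in this thread in a Plex log.
# Reads the files given as arguments, or stdin when none are given.
# NOTE: function name and labels are hypothetical, not part of Plex.
tally_transcode_errors() {
    awk '
        /Failed to initialise VAAPI connection/ { n["vaapi_init"]++ }
        /EAE timeout!/                          { n["eae_timeout"]++ }
        /failed to rename file/                 { n["chunk_rename"]++ }
        /CUDA_ERROR_OUT_OF_MEMORY/              { n["cuda_oom"]++ }
        END {
            # Unset counters print as 0 with %d.
            printf "vaapi_init=%d\n",   n["vaapi_init"]
            printf "eae_timeout=%d\n",  n["eae_timeout"]
            printf "chunk_rename=%d\n", n["chunk_rename"]
            printf "cuda_oom=%d\n",     n["cuda_oom"]
        }
    ' "$@"
}
```

It could be run as `tally_transcode_errors "Plex Media Server.log"` from the server's log directory (the exact log location depends on the install), once while the problem is occurring and once while it is not, to compare the counts.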

First up. STOP with the way you’re doing all this; it’s not working and it’s making a mess.

You’re using a sledgehammer to fix something which isn’t broken.
It is MISCONFIGURED :)

  1. When you have an Intel GPU in the CPU, it becomes renderD128
  2. When you add an Nvidia GPU, it becomes renderD129
  3. This type of error message:
Jan 02, 2023 07:34:49.010 [0x7ff560d72b38] INFO - [Req#72e/Transcode] CodecManager: starting EAE at "/tmp/pms-a362c36d-0d93-466f-b54a-588d298af179/EasyAudioEncoder"
Jan 02, 2023 07:34:54.573 [0x7ff55c96db38] ERROR - [Req#7fb/Transcode/4e4epov2euux75mhbb1xxfy7/757e6f46-8b55-45de-9db9-ae722d6db1cc] [mp4 @ 0x7f4060b826c0] failed to rename file chunk-stream0-00054.m4s.tmp to chunk-stream0-00054.m4s: No such file or directory
Jan 02, 2023 07:34:54.574 [0x7ff560d72b38] ERROR - [Req#818/Transcode/4e4epov2euux75mhbb1xxfy7/757e6f46-8b55-45de-9db9-ae722d6db1cc] av_interleaved_write_frame(): No such file or directory
Jan 02, 2023 07:34:54.580 [0x7ff55d540b38] ERROR - [Req#81d/Transcode/4e4epov2euux75mhbb1xxfy7/757e6f46-8b55-45de-9db9-ae722d6db1cc] [mp4 @ 0x7f4060b829c0] failed to rename file chunk-stream1-00054.m4s.tmp to chunk-stream1-00054.m4s: No such file or directory
Jan 02, 2023 07:34:56.274 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Jan 02, 2023 07:34:56.550 [0x7ff55febcb38] ERROR - [Req#837/Transcode] [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
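
To check which render node belongs to which GPU, rather than assuming the numbering in points 1 and 2, the kernel driver behind each node can be read from sysfs. A minimal sketch, assuming the usual Linux layout where `/sys/class/drm/renderD*/device/driver` is a symlink to the bound driver; the function name `list_render_nodes` is hypothetical.

```shell
# Print each DRM render node with the kernel driver bound to it.
# Optional argument: sysfs DRM directory (defaults to /sys/class/drm).
# NOTE: function name is hypothetical; the sysfs layout is assumed.
list_render_nodes() {
    drm_root=${1:-/sys/class/drm}
    for node in "$drm_root"/renderD*; do
        # Skip the unexpanded glob when no render nodes exist.
        [ -e "$node" ] || continue
        if [ -L "$node/device/driver" ]; then
            driver=$(basename "$(readlink "$node/device/driver")")
        else
            driver=unknown
        fi
        printf '/dev/dri/%s\t%s\n' "$(basename "$node")" "$driver"
    done
}
```

On a host with both an Intel iGPU and an NVIDIA card, this would be expected to show two nodes with different drivers, matching the renderD128/renderD129 split described above. On a host with only an NVIDIA card, a single node is expected.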

I ask:

  1. Where is the transcoder temporary directory - local or on a network location?
  2. Where is the /config stored - local device or network location?
  3. How many media items are indexed?

Please do the following:

  1. Verify DEBUG logging is enabled and VERBOSE is disabled. SAVE if changed.
  2. Restart PMS
  3. Wait 5 full minutes of no activity
  4. Download the logs ZIP file from PMS

Attach it here, please.

  1. The behaviour is always the same, whether I transcode to tmpfs/ramdisk, local Intel Optane, or local NVMe.
  2. From the Plex server’s point of view the /config drive is local, but it’s running in a virtual infrastructure backed by multiple enterprise NVMe drives, 1 million IOPS each.
  3. Around 30-40k items.

The error messages you quoted might not be related: I’m seeing those messages when playback is stopped and the transcoding is terminated.

The issues never ever happen when the system is idle, or has low activity.

I have noticed that when the issues start occurring, there are multiple “Plex Media Server” processes hogging the GPU, some of them using more and more GPU memory.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.108.03   Driver Version: 510.108.03   CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A5000    On   | 00000000:06:10.0 Off |                  Off |
| 30%   50C    P2    73W / 230W |  18823MiB / 24564MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
        
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    280538      C   ...aserver/Plex Media Server      549MiB |
|    0   N/A  N/A    409228      C   ...aserver/Plex Media Server      213MiB |
|    0   N/A  N/A    515089      C   ...aserver/Plex Media Server       15MiB |
|    0   N/A  N/A    650790      C   ...aserver/Plex Media Server     1542MiB |
|    0   N/A  N/A    811892      C   ...aserver/Plex Media Server      163MiB |
|    0   N/A  N/A    819623      C   ...aserver/Plex Media Server      183MiB |
|    0   N/A  N/A    848945      C   ...aserver/Plex Media Server      201MiB |
|    0   N/A  N/A    865079      C   ...aserver/Plex Media Server      209MiB |
|    0   N/A  N/A    865682      C   ...aserver/Plex Media Server      336MiB |
|    0   N/A  N/A    876097      C   ...aserver/Plex Media Server      169MiB |
|    0   N/A  N/A    879949      C   ...aserver/Plex Media Server      777MiB |
|    0   N/A  N/A    888291      C   ...aserver/Plex Media Server     1576MiB |
|    0   N/A  N/A    890678      C   ...aserver/Plex Media Server     3260MiB |
|    0   N/A  N/A    891919      C   ...aserver/Plex Media Server     2675MiB |
|    0   N/A  N/A    892632      C   ...aserver/Plex Media Server       17MiB |
|    0   N/A  N/A    893457      C   ...aserver/Plex Media Server     4024MiB |
|    0   N/A  N/A    893477      C   ...aserver/Plex Media Server      809MiB |
|    0   N/A  N/A    894611      C   ...aserver/Plex Media Server      943MiB |
|    0   N/A  N/A    895677      C   ...aserver/Plex Media Server      205MiB |
|    0   N/A  N/A   2333941      C   ...aserver/Plex Media Server       20MiB |
|    0   N/A  N/A    898232      C   ...diaserver/Plex Transcoder      250MiB |
+-----------------------------------------------------------------------------+
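
With this many rows, it helps to total the GPU memory per process name. The sketch below parses the "Processes" table of plain `nvidia-smi` output as pasted above. The function name `gpu_mem_by_name` is hypothetical, and the parsing assumes the column layout shown in this post, which may differ in other driver versions.

```shell
# Sum GPU memory per process name from plain `nvidia-smi` output.
# Reads files given as arguments, or stdin when none are given.
# NOTE: function name is hypothetical; column layout is assumed.
gpu_mem_by_name() {
    awk '
        # Process rows: | GPU GI CI PID Type <name words...> <N>MiB |
        $1 == "|" && $NF == "|" && $5 ~ /^[0-9]+$/ && $(NF-1) ~ /^[0-9]+MiB$/ {
            name = $7
            for (i = 8; i <= NF - 2; i++) name = name " " $i
            mib = $(NF-1)
            sub(/MiB$/, "", mib)
            count[name]++
            total[name] += mib
        }
        END {
            for (name in total)
                printf "%d procs, %d MiB: %s\n", count[name], total[name], name
        }
    ' "$@" | sort
}
```

Usage would be `nvidia-smi | gpu_mem_by_name`. For the table above, the 20 Plex Media Server rows add up to 17886 MiB, against 250 MiB for the single Plex Transcoder row.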

Which PMS version are you running in the container?

Without seeing logs, there’s little to go on but this:

Jan 02, 2023 07:35:01.000 [0x7ff55d540b38] ERROR - [Req#855/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [eac3_eae @ 0x7f403d872640] EAE timeout! EAE not running, or wrong folder? Could not read '/tmp/pms-a362c36d-0d93-466f-b54a-588d298af179/EasyAudioEncoder/Convert to WAV (to 8ch or less)/xcbvy251hdmzwlytjwcowrv8_562-0-314.wav'
Jan 02, 2023 07:35:01.001 [0x7ff55febcb38] ERROR - [Req#8f0/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] [eac3_eae @ 0x7f403d872640] error reading output: -5 (I/O error)
Jan 02, 2023 07:35:01.001 [0x7ff56137bb38] ERROR - [Req#8f3/Transcode/xcbvy251hdmzwlytjwcowrv8/bf01e098-de8b-4d11-ac13-d949ed28954d] Error while decoding stream #0:1: I/O error

This has a few possible causes:

  1. Transcoder temp directory (where EAE writes) is on a network (non-block I/O) connection and absolute file locks are not supported.

  2. inotify table full (max_user_watches). This is the most common.

  3. PMS 1.30.x has a regression with EAE.

Currently I’m using the linuxserver/plex:1.30.0 Docker image.

Back down to

plexinc/pms-docker:1.29.2.6364-6d72b0cf6

(or LSIO)

and retest

Ok will give that a try, and increase the max_user_watches.

Don’t increase it blindly. Each slot consumes non-paged kernel memory.

Issue persists when running 1.29.2.6364-6d72b0cf6 and previous versions.

The server was already configured for high inotify watches.

fs.inotify.max_queued_events = 16777216
fs.inotify.max_user_instances = 16777216
fs.inotify.max_user_watches = 16777216
user.max_inotify_instances = 16777216
user.max_inotify_watches = 16777216

I have assigned 384GB of memory to the server, so I’m not worried about memory.

One thing is very consistent: the issue NEVER happens when there’s low/medium load on the server, say only 20-30 transcodes. Also, when I do see Plex Media Server show up in nvidia-smi under that load, the processes use only a small amount of GPU memory, less than 512MB, and disappear from nvidia-smi soon after.

Then suddenly, on higher load, the issues start, and Plex Media Server processes start showing up in nvidia-smi using a huge amount of GPU memory, which keeps increasing every time I rerun nvidia-smi. What could cause the Plex Media Server process to consume so much GPU memory?

The Plex Transcoder processes only ever consume ~200-400MB of GPU memory, whether the issue is occurring or not.

Right now the issue is occurring and I am checking nvidia-smi periodically.

Mon Jan  9 22:30:28 2023
...
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    357285      C   ...aserver/Plex Media Server     1613MiB |
|    0   N/A  N/A    534934      C   ...aserver/Plex Media Server     2068MiB |
Mon Jan  9 22:30:38 2023       
|    0   N/A  N/A    357285      C   ...aserver/Plex Media Server     1723MiB |
|    0   N/A  N/A    534934      C   ...aserver/Plex Media Server     2739MiB |
Mon Jan  9 22:30:46 2023
|    0   N/A  N/A    357285      C   ...aserver/Plex Media Server     2288MiB |
|    0   N/A  N/A    534934      C   ...aserver/Plex Media Server     3283MiB |
Mon Jan  9 22:31:15 2023
|    0   N/A  N/A    357285      C   ...aserver/Plex Media Server     3731MiB |
|    0   N/A  N/A    534934      C   ...aserver/Plex Media Server     4024MiB |
Mon Jan  9 22:31:49 2023
|    0   N/A  N/A    357285      C   ...aserver/Plex Media Server     5806MiB |
|    0   N/A  N/A    534934      C   ...aserver/Plex Media Server     4756MiB |
Mon Jan  9 22:35:16 2023
|    0   N/A  N/A    357285      C   ...aserver/Plex Media Server     6070MiB |
|    0   N/A  N/A    534934      C   ...aserver/Plex Media Server     9671MiB |

It keeps going until the GPU has no memory left, and then the logs start flooding with:

Jan 09, 2023 22:34:42.457 [0x7f0eb6c93b38] ERROR - [Req#2c89f/Transcode] [FFMPEG] - cu->cuCtxCreate(&hwctx->cuda_ctx, desired_flags, hwctx->internal->cuda_device) failed
Jan 09, 2023 22:34:42.457 [0x7f0eb6c93b38] ERROR - [Req#2c89f/Transcode] [FFMPEG] -  -> CUDA_ERROR_OUT_OF_MEMORY: out of memory
Jan 09, 2023 22:34:42.457 [0x7f0eb6c93b38] ERROR - [Req#2c89f/Transcode] [FFMPEG] - 
Jan 09, 2023 22:34:44.930 [0x7f0d52f4fb38] ERROR - [Req#2e54a/Transcode] [FFMPEG] - cu->cuCtxCreate(&hwctx->cuda_ctx, desired_flags, hwctx->internal->cuda_device) failed
Jan 09, 2023 22:34:44.930 [0x7f0d52f4fb38] ERROR - [Req#2e54a/Transcode] [FFMPEG] -  -> CUDA_ERROR_OUT_OF_MEMORY: out of memory
Jan 09, 2023 22:34:44.930 [0x7f0d52f4fb38] ERROR - [Req#2e54a/Transcode] [FFMPEG] - 
Jan 09, 2023 22:34:47.208 [0x7f0d97345b38] ERROR - [Req#2e347/Transcode] [FFMPEG] - cu->cuCtxCreate(&hwctx->cuda_ctx, desired_flags, hwctx->internal->cuda_device) failed
Jan 09, 2023 22:34:47.208 [0x7f0d97345b38] ERROR - [Req#2e347/Transcode] [FFMPEG] -  -> CUDA_ERROR_OUT_OF_MEMORY: out of memory
Jan 09, 2023 22:34:47.208 [0x7f0d97345b38] ERROR - [Req#2e347/Transcode] [FFMPEG] - 
Jan 09, 2023 22:34:53.678 [0x7f0d53152b38] ERROR - [Req#2f2ac/Transcode] [FFMPEG] - cu->cuCtxCreate(&hwctx->cuda_ctx, desired_flags, hwctx->internal->cuda_device) failed
Jan 09, 2023 22:34:53.678 [0x7f0d53152b38] ERROR - [Req#2f2ac/Transcode] [FFMPEG] -  -> CUDA_ERROR_OUT_OF_MEMORY: out of memory
Jan 09, 2023 22:34:53.678 [0x7f0d53152b38] ERROR - [Req#2f2ac/Transcode] [FFMPEG] - 
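
The periodic nvidia-smi checks above can be automated so the growth per PID is captured with timestamps up to the point where CUDA_ERROR_OUT_OF_MEMORY appears. A minimal sketch, assuming `nvidia-smi` is on the PATH; the function name `log_gpu_mem` and its defaults are hypothetical.

```shell
# Log per-process GPU memory as CSV: timestamp, pid, process name, MiB.
# Usage: log_gpu_mem [interval_seconds] [sample_count]
# NOTE: function name and defaults are hypothetical.
log_gpu_mem() {
    interval=${1:-10}
    count=${2:-6}
    i=0
    while [ "$i" -lt "$count" ]; do
        ts=$(date '+%Y-%m-%d %H:%M:%S')
        # One CSV row per compute process currently holding GPU memory.
        nvidia-smi --query-compute-apps=pid,process_name,used_memory \
                   --format=csv,noheader,nounits |
            while IFS= read -r row; do
                printf '%s, %s\n' "$ts" "$row"
            done
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `log_gpu_mem 10 60 > gpu-mem.csv` would record ten minutes of samples. Filtering the file for one PID (such as 357285 or 534934 above) then shows how quickly that process grows.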

How far into the movie does this occur?

Can you use dd to cut me that chunk (upload it to cloud) so that I can reproduce it?

These are insane. (Each slot needs 540 + pathname length bytes.)

I need this. (adjust fs.inotify.max_user_watches only for your media DIRECTORY count).

[chuck@lizum ~.2002]$ sudo !!
sudo sysctl -a  | grep inotify
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 128
fs.inotify.max_user_watches = 262144
user.max_inotify_instances = 128
user.max_inotify_watches = 262144
[chuck@lizum ~.2003]$
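
Since the advice is to size fs.inotify.max_user_watches to the media directory count, that count can be measured directly. A small sketch; the function name `count_media_dirs` is hypothetical and the media root paths are whatever your libraries use.

```shell
# Count directories (including the roots themselves) under the given
# media roots, as a basis for sizing fs.inotify.max_user_watches.
# NOTE: function name is hypothetical.
count_media_dirs() {
    total=0
    for root in "$@"; do
        n=$(find "$root" -type d | wc -l)
        total=$((total + n))
    done
    echo "$total"
}
```

For example, `count_media_dirs /media/movies /media/tv` (paths are placeholders) prints one number. A max_user_watches value comfortably above it should be enough under the reasoning given here.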

You’re expecting 30 transcodes of WHAT? (HEVC HDR)?
Please share some of these details; we need to do the math.

If the inotify slots use 540 + pathname bytes each, let’s say all pathnames are 1K (probably less in reality).

(540+1024) * 16777216= 26239565824

26239565824/1024^3
24.4375

So it would use a maximum of about 24GB to fill all slots, if I understand correctly? That should not be a problem then, right? Right now there’s ~330GB of free memory.
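
The same estimate can be repeated for other watch limits. The sketch below uses the 540 + pathname bytes per slot figure quoted earlier in this thread (not independently verified); the function name `inotify_watch_bytes` is hypothetical.

```shell
# Worst-case kernel memory in bytes if every inotify watch slot is used.
# Usage: inotify_watch_bytes <max_user_watches> [avg_path_len_bytes]
# NOTE: the 540-byte per-slot overhead is the figure quoted in this
# thread; the function name is hypothetical.
inotify_watch_bytes() {
    watches=$1
    path_len=${2:-1024}
    echo $(( (540 + path_len) * watches ))
}
```

With 1K pathnames, `inotify_watch_bytes 16777216` gives 26239565824 bytes (24.4375 GiB), while `inotify_watch_bytes 262144` gives 409993216 bytes (391 MiB).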

Also, I have “Run a partial scan when changes are detected” and every other automatic library scan setting disabled. I can lower the inotify values if you still think it makes a difference.

All content on all servers should be 720p/1080p h264. I’m 99.9% certain there’s no HEVC or 4K anywhere.

I am not sure the issue is related to some specific content being played. Because I’m getting the exact same behaviour on multiple servers on different hardware with different non-overlapping content.

I just noticed an interesting message that’s flooding quite heavily in Plex Media Scanner Deep Analysis.log while the issue is happening.

Jan 09, 2023 05:04:11.589 [0x7f5bd0d74140] DEBUG - [FFMPEG] - ct_type:0 pic_struct:0
Jan 09, 2023 05:04:11.590 [0x7f5bd0d74140] DEBUG - [FFMPEG] - ct_type:0 pic_struct:0
Jan 09, 2023 05:04:11.590 [0x7f5bd0d74140] DEBUG - [FFMPEG] - ct_type:0 pic_struct:0
Jan 09, 2023 05:04:11.590 [0x7f5bd0d74140] DEBUG - [FFMPEG] - ct_type:0 pic_struct:0
Jan 09, 2023 05:04:11.590 [0x7f5bd0d74140] DEBUG - [FFMPEG] - ct_type:0 pic_struct:0
Jan 09, 2023 05:04:11.590 [0x7f5bd0d74140] DEBUG - [FFMPEG] - ct_type:0 pic_struct:0
# cat Plex\ Media\ Scanner\ Deep\ Analysis.*.log  | grep ct_type | wc -l
381680

Um, actually, on second thought, these logs are from a different time, so maybe not related.

That’s 24, round up to 25 GB of kernel memory locked for just the inotify.max_user_watches.

Kernel usage increases even higher as you add 16M events (which isn’t how it’s done. There aren’t going to be 16M files modified within 5 seconds, are there?). Let the kernel manage the queue depth. This isn’t a real-time financial processing system that needs a preset queue.

When deep analysis runs, it generates both an audio and video bitrate profile.
This shows up in the XML. It will show you ‘bitrates’ fields.

This is for when automatic bitrate is enabled at the player.

# sysctl -a | grep inotify     
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 65536
fs.inotify.max_user_watches = 262144
user.max_inotify_instances = 65536
user.max_inotify_watches = 262144

Running with those values now.
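
To keep these values across reboots, they can be written to a sysctl drop-in file. This is a sketch only: the file name `90-inotify.conf` and the function name `write_inotify_conf` are hypothetical, the numbers are the ones listed above, and the user.* entries are left out. The fs.inotify limits are, as far as I know, host-wide, so this would be applied on the Docker host rather than inside the container.

```shell
# Write the fs.inotify limits from this post to a sysctl drop-in file.
# Usage: write_inotify_conf [destination_file]
# NOTE: function name and default file name are hypothetical.
write_inotify_conf() {
    dest=${1:-/etc/sysctl.d/90-inotify.conf}
    cat > "$dest" <<'EOF'
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 65536
fs.inotify.max_user_watches = 262144
EOF
}
```

After writing the file as root, `sysctl --system` reloads all drop-in files, and `sysctl -a | grep inotify` confirms the active values.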

GPU RAM issue has not occurred for the last 2 hours. Will notify here next time it happens.

Check your PM.

@trex_7

I’m having trouble finding our discussion on this topic in my PMs.

Would you ping me, please?

Thanks.
