Server Will Not Transcode More Than 2 Downloads Simultaneously

This is for Plex Server Version#: 1.40.0.7998
System: Debian Bookworm
Nvidia Version: 545.23.06
CUDA Version: 12.3

The server has the NVENC and NvFBC patches which remove the 2x stream cap from Nvidia drivers, and they are functional, as demonstrated below.

No matter what is set in ‘Settings → Transcoder’, the server will never transcode more than two downloads at once. While transcoding, the GPU sits at 39% utilization and the CPU jumps to 50% utilization. It also does not matter whether the downloads come from multiple clients or a single client. I can queue up a whole season for my iPad…it will churn through 2 episodes at a time…or I can have 4x mobile clients each download 1x episode of different shows…same thing. 2 episodes at a time max.

Transcoding for streaming is different. There can be 5x streaming clients and all will be receiving HW transcodes and the CPU will be at around 10-15% utilization.

Settings → Transcoder:

Transcoder quality: Automatic
Transcoder temporary directory: /dev/shm
Transcoder default throttle buffer: 120
Background transcoding x264 preset: Medium
Enable HDR tone mapping: Checked
Disable video stream transcoding: Not Checked
Use hardware acceleration when available: Checked
Use hardware-accelerated video encoding: Checked
Hardware transcoding device: GP107GL [Quadro P620]
Maximum simultaneous video transcode: 10

nvidia-smi output with queue of 10 transcodes pending (CPU @ 50% util):

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.06              Driver Version: 545.23.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro P620                    Off | 00000000:01:00.0 Off |                  N/A |
| 39%   53C    P0              N/A /  N/A |    311MiB /  2048MiB |     20%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A   1072594      C   ...lib/plexmediaserver/Plex Transcoder      154MiB |
|    0   N/A  N/A   1072597      C   ...lib/plexmediaserver/Plex Transcoder      154MiB |
+---------------------------------------------------------------------------------------+
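For anyone who wants to log the number of active transcoders over time instead of eyeballing the table, here is a minimal sketch that counts the `Plex Transcoder` rows in the text that `nvidia-smi` prints. The function name `count_plex_transcoders` and the `SAMPLE` string are my own illustration, and the regular expression assumes the process-table layout shown above (driver 545.23.06); other driver versions may format the table differently.

```python
import re

# Two process rows copied from the nvidia-smi output above, used as sample input.
SAMPLE = """\
|    0   N/A  N/A   1072594      C   ...lib/plexmediaserver/Plex Transcoder      154MiB |
|    0   N/A  N/A   1072597      C   ...lib/plexmediaserver/Plex Transcoder      154MiB |
"""

# Matches one row of the "Processes" table: GPU, GI, CI, PID, type C, name, memory.
_ROW = re.compile(
    r"^\|\s+\d+\s+\S+\s+\S+\s+(\d+)\s+C\s+(.*?)\s+\d+MiB\s+\|\s*$"
)


def count_plex_transcoders(smi_text: str) -> int:
    """Count compute-type process rows whose name ends in 'Plex Transcoder'."""
    count = 0
    for line in smi_text.splitlines():
        match = _ROW.match(line)
        if match and match.group(2).endswith("Plex Transcoder"):
            count += 1
    return count


print(count_plex_transcoders(SAMPLE))
```

To use it against a live system, the text could be captured with `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout` and passed to the same function.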

Verifying that GPU is unlocked and can do more than 2x transcodes:

./patch-tester.sh 
ffmpeg version 5.1.3-1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 12 (Debian 12.2.0-14)
  configuration: --prefix=/usr --extra-version=1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
-vsync is deprecated. Use -fps_mode
Passing a number to -vsync is deprecated, use a string argument as described in the manual.
Input #0, lavfi, from 'testsrc':
  Duration: N/A, start: 0.000000, bitrate: N/A
  Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 320x240 [SAR 1:1 DAR 4:3], 25 tbr, 25 tbn
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> h264 (h264_nvenc))
  Stream #0:0 -> #1:0 (rawvideo (native) -> h264 (h264_nvenc))
  Stream #0:0 -> #2:0 (rawvideo (native) -> h264 (h264_nvenc))
  Stream #0:0 -> #3:0 (rawvideo (native) -> h264 (h264_nvenc))
  Stream #0:0 -> #4:0 (rawvideo (native) -> h264 (h264_nvenc))
  Stream #0:0 -> #5:0 (rawvideo (native) -> h264 (h264_nvenc))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #0:0: Video: h264 (Main), cuda(pc, gbr/unknown/unknown, progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 4000 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc59.37.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/4000000 buffer size: 8000000 vbv_delay: N/A
Output #1, null, to 'pipe:':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #1:0: Video: h264 (Main), cuda(pc, gbr/unknown/unknown, progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 1000 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc59.37.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/1000000 buffer size: 2000000 vbv_delay: N/A
Output #2, null, to 'pipe:':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #2:0: Video: h264 (Main), cuda(pc, gbr/unknown/unknown, progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 8000 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc59.37.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/8000000 buffer size: 16000000 vbv_delay: N/A
Output #3, null, to 'pipe:':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #3:0: Video: h264 (Main), cuda(pc, gbr/unknown/unknown, progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 6000 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc59.37.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/6000000 buffer size: 12000000 vbv_delay: N/A
Output #4, null, to 'pipe:':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #4:0: Video: h264 (Main), cuda(pc, gbr/unknown/unknown, progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 5000 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc59.37.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/5000000 buffer size: 10000000 vbv_delay: N/A
Output #5, null, to 'pipe:':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #5:0: Video: h264 (Main), cuda(pc, gbr/unknown/unknown, progressive), 320x240 [SAR 1:1 DAR 4:3], q=2-31, 7000 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc59.37.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/7000000 buffer size: 14000000 vbv_delay: N/A
frame= 1250 fps=262 q=8.0 Lq=9.0 q=9.0 q=9.0 q=9.0 q=9.0 size=N/A time=00:02:15.52 bitrate=N/A speed=28.4x    
video:12550kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Exiting normally, received signal 2.
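The contents of patch-tester.sh are not reproduced here, so the following is only a sketch of the kind of command that would produce a log like the one above: one `testsrc` input fanned out to several `h264_nvenc` encoders, each writing to the null muxer. The helper name `build_nvenc_test_command` is hypothetical; the bitrates are the ones reported in the log. It only builds the argument list and does not run ffmpeg.

```python
def build_nvenc_test_command(bitrates_kbps):
    """Build an ffmpeg argv that opens one NVENC session per bitrate.

    Assumption: this approximates what a multi-session test does; it is not
    the actual patch-tester.sh script.
    """
    # Single synthetic input generated by the lavfi 'testsrc' source.
    cmd = ["ffmpeg", "-f", "lavfi", "-i", "testsrc"]
    for kbps in bitrates_kbps:
        # Each group of options ends with an output ('-' with the null muxer),
        # so every iteration adds one more simultaneous encoder.
        cmd += ["-c:v", "h264_nvenc", "-b:v", f"{kbps}k", "-f", "null", "-"]
    return cmd


# Bitrates (kb/s) of outputs #0 through #5 in the log above.
command = build_nvenc_test_command([4000, 1000, 8000, 6000, 5000, 7000])
print(" ".join(command))
```

If the driver were still capped at two sessions, a command shaped like this would be expected to fail when the third encoder is opened rather than run all six as the log shows.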

I’ve gone through every setting in the admin interface, every .xml file I can find, and even pored over the tables in com.plexapp.plugins.library.db, but I cannot find where this limit is being imposed. Does anyone have any ideas how to fix this?

Nvidia imposes the limit in their driver (below what PMS can interact with).

@ChuckPa: Yes, it is normally limited to 2x streams. However, I have the patch to remove that limit. You can see the test for that under the 3rd section of my post “Verifying that GPU is unlocked and can do more than 2x transcodes:”

As another example, I just fired up three devices watching various movies while also downloading two shows on a fourth:

Wed Mar 13 13:38:59 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.06              Driver Version: 545.23.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro P620                    Off | 00000000:01:00.0 Off |                  N/A |
| 40%   55C    P0              N/A /  N/A |   1224MiB /  2048MiB |     41%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A    143120      C   ...lib/plexmediaserver/Plex Transcoder      154MiB |
|    0   N/A  N/A    143147      C   ...lib/plexmediaserver/Plex Transcoder      154MiB |
|    0   N/A  N/A    162578      C   ...lib/plexmediaserver/Plex Transcoder      396MiB |
|    0   N/A  N/A    163022      C   ...lib/plexmediaserver/Plex Transcoder      396MiB |
|    0   N/A  N/A    177164      C   ...lib/plexmediaserver/Plex Transcoder      120MiB |
+---------------------------------------------------------------------------------------+

The 2x limit only shows up when doing downloads. For streaming, I can run as many transcodes as the card has capacity for, which is between 8 and 10 simultaneous transcodes.

@ChuckPa A couple more items of note

  1. My card is a Quadro, which was never limited to only 2x streams
  2. In March 2023, Nvidia raised that limit to 5 for many consumer cards

Any way you slice it, transcoding for downloads is somehow arbitrarily locked to 2x transcodes…and it doesn’t seem like Nvidia is in the way here.

I’ll ask the dumb question.

Did it compile the kmod drivers and rebuild the initramfs after being patched followed by the obvious reboot?

Yes, that is all part of the process. The system did have update-initramfs -u -k all and a reboot.

I have the P2200 with driver version: 535.161.07

In PMS, Settings - Server - Transcoder - Show Advanced

At the bottom: Maximum simultaneous video transcodes: Unlimited

Don’t forget subtitle burning is done on the CPU. It will suppress using the GPU if image-based subtitles are in the file.

I just tried what you advised.

Changed from 10 transcodes to Unlimited. Selected a season of a program with no subtitles and tried to download…still only 2x transcodes. I even restarted PMS before trying the download, just for good measure.

Just to clarify…you are able to transcode more than 2x downloads at a time?

If you do a download of a season of a show…how many episodes does it transcode and push at a time?

Most of my users are DirectPlay. Nobody downloads.

I can create a test load of 6 transcode playbacks without issue.

Yeah…I can do 10x transcode playbacks no problem…it is only the downloads where I’m seeing the weird limitation.

If I want to download a whole season of a show it should be transcoding as many episodes as are set in my transcode limit…not just doing 2x at a time. As a result it is 40% as fast as it could be…the card is 60% idle.
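To put rough numbers on that, here is a simplified model of how many back-to-back rounds of transcoding a download queue needs under a given concurrency limit. The function name `download_waves` is made up for illustration, and the model assumes every episode takes the same time and that the card has headroom for the extra sessions; in practice each transcode would likely slow down as more run in parallel, so the real gain would be smaller than the count suggests.

```python
import math


def download_waves(episodes: int, concurrent_limit: int) -> int:
    """Number of sequential rounds needed to transcode all queued episodes."""
    if episodes < 0 or concurrent_limit < 1:
        raise ValueError("episodes must be >= 0 and concurrent_limit >= 1")
    # Each round handles at most `concurrent_limit` episodes at once.
    return math.ceil(episodes / concurrent_limit)


# A 10-episode season with the observed cap of 2 versus the configured limit of 10.
print(download_waves(10, 2))
print(download_waves(10, 10))
```

Under those assumptions a 10-episode season takes five rounds at the observed cap of two, versus a single round if the configured limit of 10 were honored.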

Feels like a bug…like the download transcoder is hardcoded to max 2x shows because of the previous 2x stream max on cards, and is ignoring the “Maximum simultaneous video transcodes” setting.

Feels like a feature limitation, which Engineering confirmed is a limit on ‘static transcodes’ of two.

@ChuckPa - Can you please have someone update the documentation? I’ve spent a lot of effort trying to work this out over a period of a couple years, off and on. I would not have wasted your time or mine if this information was available elsewhere to begin with.

@obumbratum_1

I’m reaching out to the support page folks for that now

Request for enhancement submitted.
