Nvidia transcoding on Linux uses an insane amount of memory

Server Version#: 1.18.3.2111
Player Version#: any

Hardware: HP Microserver Gen8
OS: Fedora 31 (x86_64)
RAM: 16GB
GPU: GTX 1050Ti
Driver: 440.36

Everything works great and I can see the card is handling the video transcodes:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.36       Driver Version: 440.36       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:07:00.0 Off |                  N/A |
|  0%   47C    P0    N/A /  75W |    286MiB /  4039MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0    193760      C   /usr/lib/plexmediaserver/Plex Transcoder     273MiB |
+-----------------------------------------------------------------------------+

However, when I look at htop, the Plex Transcoder appears to use up to 14GB??
193760 plex 20 0 14.0G 463M 286M R 8.5 2.9 6:06.35 /usr/lib/plexmediaserver/Plex Transcoder -codec:0 h264 -hwaccel:0 nvdec -hwaccel_fallback_threshold:0 10 -ss 178 -analyzeduration 20000000 -probesize 20000000 -i /media/data/media/movies/Gemini.Man.2019.1080p.WEB-DL.DD5.1.H264-CMRG/Gemini.Man.2019.1080p.WEB-DL.DD5.1.H264-CMRG.mkv -ss 178 -analyzeduration 20000000 -probesize 20000000 -i /media/data/plexmediaserver/Library/Application Support/Plex Media Server/Cache/Transcode/Sessions/plex-transcode-q74fhc35chfs1fsniv7hmpmm-8da68be3-0d2e-4741-b7bd-a3dfc680d309/temp-0.srt -map_inlineass 1:s:0 -filter_complex [0:0]scale=w=1920:h=1080[0];[0]format=pix_fmts=yuv420p|nv12[1];[1]inlineass=font_scale=1.000000:font_path=/usr/lib/plexmediaserver/Resources/Fonts/DejaVuSans-Regular.ttf:fontconfig_file=/usr/lib/plexmediaserver/Resources/fonts.conf:language=nl[2] -map [2] -codec:0 h264_nvenc -b:0 7294k -maxrate:0 9726k -bufsize:0 19452k -forced-idr:0 1 -r:0 23.975999999999999 -force_key_frames:0 expr:gte(t,178+n_forced*1) -map 0:1 -metadata:s:1 language=eng -codec:1 copy -copypriorss:1 0 -segment_format mpegts -f ssegment -individual_header_trailer 0 -segment_time 1 -segment_start_number 178 -segment_copyts 1 -segment_time_delta 0.0625 -segment_list http://127.0.0.1:32400/video/:/transcode/session/q74fhc35chfs1fsniv7hmpmm/8da68be3-0d2e-4741-b7bd-a3dfc680d309/seglist -segment_list_type csv -segment_list_size 5 -segment_list_separate_stream_times 1 -segment_list_unfinished 1 -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 media-%05d.ts -map 1:s:0 -f null -codec ass nullfile -start_at_zero -copyts -y -init_hw_device cuda=cuda: -hwaccel_device cuda -filter_hw_device cuda -nostats -loglevel quiet -loglevel_plex error -progressurl http://127.0.0.1:32400/video/:/transcode/session/q74fhc35chfs1fsniv7hmpmm/8da68be3-0d2e-4741-b7bd-a3dfc680d309/progress

When I try to transcode a second stream, the kernel logs page allocation failures like these:

[40809.424195] Plex Media Serv: page allocation failure: order:0, mode:0x10dc0(GFP_KERNEL|__GFP_NORETRY|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[40809.424200] CPU: 6 PID: 6506 Comm: Plex Media Serv Tainted: P           OE     5.3.13-300.fc31.x86_64 #1
[40809.424201] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 04/04/2019
[40809.424202] Call Trace:
[40809.424210]  dump_stack+0x66/0x90
[40809.424212]  warn_alloc.cold+0x7b/0xfb
[40809.424216]  __alloc_pages_slowpath+0xdc4/0xe00
[40809.424219]  __alloc_pages_nodemask+0x2ee/0x340
[40809.424238]  uvm_mem_alloc+0x245/0x3b0 [nvidia_uvm]
[40809.424251]  uvm_va_range_create_semaphore_pool+0x176/0x290 [nvidia_uvm]
[40809.424262]  uvm_api_alloc_semaphore_pool+0xf6/0x1a0 [nvidia_uvm]
[40809.424270]  uvm_ioctl+0xedc/0x1360 [nvidia_uvm]
[40809.424473]  ? _nv008350rm+0x1d/0x30 [nvidia]
[40809.424475]  ? ns_capable_common+0x2e/0x50
[40809.424642]  ? _nv008375rm+0x60/0x80 [nvidia]
[40809.424739]  ? os_is_administrator+0xf/0x20 [nvidia]
[40809.424906]  ? _nv007504rm+0xd0/0x130 [nvidia]
[40809.425025]  ? os_acquire_spinlock+0xe/0x20 [nvidia]
[40809.425263]  ? _nv033270rm+0xc/0x20 [nvidia]
[40809.425423]  ? _nv036742rm+0xac/0x170 [nvidia]
[40809.425426]  ? update_load_avg+0x76/0x600
[40809.425457]  uvm_unlocked_ioctl+0x31/0x60 [nvidia_uvm]
[40809.425471]  uvm_unlocked_ioctl_entry+0x89/0xb0 [nvidia_uvm]
[40809.425475]  do_vfs_ioctl+0x405/0x660
[40809.425478]  ksys_ioctl+0x5e/0x90
[40809.425480]  __x64_sys_ioctl+0x16/0x20
[40809.425484]  do_syscall_64+0x5f/0x1a0
[40809.425488]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[40809.425490] RIP: 0033:0x7ff93ebb234b
[40809.425493] Code: 0f 1e fa 48 8b 05 3d 9b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 9b 0c 00 f7 d8 64 89 01 48
[40809.425494] RSP: 002b:00007ff8857f4ee8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[40809.425496] RAX: ffffffffffffffda RBX: 00007ff87c538f90 RCX: 00007ff93ebb234b
[40809.425497] RDX: 00007ff8857f5270 RSI: 0000000000000044 RDI: 0000000000000060
[40809.425497] RBP: 00007ff8857f5270 R08: 0000000000000001 R09: 00007ff8857f5270
[40809.425498] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000044
[40809.425499] R13: 0000000000000060 R14: 0000000205c00000 R15: 0000000000000000
[40809.425528] Mem-Info:
[40809.425535] active_anon:574979 inactive_anon:15075 isolated_anon:0
                active_file:1138313 inactive_file:1910555 isolated_file:0
                unevictable:0 dirty:9976 writeback:0 unstable:0
                slab_reclaimable:61051 slab_unreclaimable:64858
                mapped:243376 shmem:15263 pagetables:4412 bounce:0
                free:49018 free_pcp:1 free_cma:0
[40809.425539] Node 0 active_anon:2299916kB inactive_anon:60300kB active_file:4553252kB inactive_file:7642220kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:973504kB dirty:39904kB writeback:0kB shmem:61052kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 120832kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[40809.425540] Node 0 DMA free:15884kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15968kB managed:15884kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[40809.425544] lowmem_reserve[]: 0 3313 15915 15915 15915
[40809.425546] Node 0 DMA32 free:64132kB min:14056kB low:17568kB high:21080kB active_anon:247824kB inactive_anon:0kB active_file:1050624kB inactive_file:1918972kB unevictable:0kB writepending:13064kB present:3487632kB managed:3422096kB mlocked:0kB kernel_stack:544kB pagetables:716kB bounce:0kB free_pcp:8kB local_pcp:0kB free_cma:0kB
[40809.425549] lowmem_reserve[]: 0 0 12601 12601 12601
[40809.425551] Node 0 Normal free:116056kB min:116948kB low:130312kB high:143676kB active_anon:2052092kB inactive_anon:60300kB active_file:3502692kB inactive_file:5723636kB unevictable:0kB writepending:26840kB present:13238268kB managed:12912116kB mlocked:0kB kernel_stack:10032kB pagetables:16932kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[40809.425555] lowmem_reserve[]: 0 0 0 0 0
[40809.425557] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15884kB
[40809.425567] Node 0 DMA32: 69*4kB (UM) 114*8kB (UME) 74*16kB (UME) 61*32kB (ME) 68*64kB (UME) 30*128kB (ME) 17*256kB (UM) 8*512kB (UME) 11*1024kB (UME) 12*2048kB (UM) 2*4096kB (UM) = 64996kB
[40809.425576] Node 0 Normal: 1123*4kB (UMEH) 885*8kB (UMEH) 454*16kB (UMEH) 920*32kB (UME) 884*64kB (UME) 97*128kB (UME) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 117524kB
[40809.425585] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[40809.425585] 3062277 total pagecache pages
[40809.425587] 0 pages in swap cache
[40809.425588] Swap cache stats: add 0, delete 0, find 0/0
[40809.425588] Free swap  = 0kB
[40809.425589] Total swap = 0kB
[40809.425589] 4185467 pages RAM
[40809.425590] 0 pages HighMem/MovableOnly
[40809.425590] 97943 pages reserved
[40809.425591] 0 pages cma reserved
[40809.425591] 0 pages hwpoisoned
[40809.794447] Plex Media Serv: page allocation failure: order:0, mode:0x10dc0(GFP_KERNEL|__GFP_NORETRY|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[40809.794452] CPU: 6 PID: 6506 Comm: Plex Media Serv Tainted: P           OE     5.3.13-300.fc31.x86_64 #1
[40809.794453] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 04/04/2019
[40809.794453] Call Trace:
[40809.794461]  dump_stack+0x66/0x90
[40809.794464]  warn_alloc.cold+0x7b/0xfb
[40809.794468]  __alloc_pages_slowpath+0xdc4/0xe00
[40809.794471]  __alloc_pages_nodemask+0x2ee/0x340
[40809.794492]  uvm_mem_alloc+0x245/0x3b0 [nvidia_uvm]
[40809.794504]  uvm_va_range_create_semaphore_pool+0x176/0x290 [nvidia_uvm]
[40809.794515]  uvm_api_alloc_semaphore_pool+0xf6/0x1a0 [nvidia_uvm]
[40809.794524]  uvm_ioctl+0xedc/0x1360 [nvidia_uvm]
[40809.794733]  ? _nv008350rm+0x1d/0x30 [nvidia]
[40809.794737]  ? ns_capable_common+0x2e/0x50
[40809.794907]  ? _nv008375rm+0x60/0x80 [nvidia]
[40809.795005]  ? os_is_administrator+0xf/0x20 [nvidia]
[40809.795173]  ? _nv007504rm+0xd0/0x130 [nvidia]
[40809.795270]  ? os_acquire_spinlock+0xe/0x20 [nvidia]
[40809.795440]  ? _nv033270rm+0xc/0x20 [nvidia]
[40809.795539]  ? _nv036742rm+0xac/0x170 [nvidia]
[40809.795541]  ? update_load_avg+0x76/0x600
[40809.795552]  uvm_unlocked_ioctl+0x31/0x60 [nvidia_uvm]
[40809.795560]  uvm_unlocked_ioctl_entry+0x89/0xb0 [nvidia_uvm]
[40809.795563]  do_vfs_ioctl+0x405/0x660
[40809.795564]  ksys_ioctl+0x5e/0x90
[40809.795565]  __x64_sys_ioctl+0x16/0x20
[40809.795567]  do_syscall_64+0x5f/0x1a0
[40809.795569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[40809.795571] RIP: 0033:0x7ff93ebb234b
[40809.795574] Code: 0f 1e fa 48 8b 05 3d 9b 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d 9b 0c 00 f7 d8 64 89 01 48
[40809.795575] RSP: 002b:00007ff8857f51a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[40809.795576] RAX: ffffffffffffffda RBX: 00007ff87c306670 RCX: 00007ff93ebb234b
[40809.795577] RDX: 00007ff8857f5530 RSI: 0000000000000044 RDI: 0000000000000060
[40809.795578] RBP: 00007ff8857f5530 R08: 0000000000000001 R09: 00007ff8857f5530
[40809.795578] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000044
[40809.795579] R13: 0000000000000060 R14: 0000000205c00000 R15: 0000000000000000

What could be the cause of this?
Or do I need to build a machine with 64GB of ram in order to use hardware transcoding?
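
The Mem-Info dump in the trace above can be cross-checked with a little arithmetic. The sketch below only uses numbers copied from the first trace. The 4 kB page size is inferred from the log itself (574979 active_anon pages correspond to the reported 2299916kB). How to read the result is an interpretation, not something the log states: most RAM appears to be page cache, little is free, and the Normal zone sits just under its min watermark, which is a situation where a `__GFP_NORETRY` allocation may fail instead of waiting for reclaim.

```python
# Cross-check of the Mem-Info figures from the first trace.
# Every number below is copied from the log; nothing is measured here.

PAGE_KB = 4  # inferred: 574979 active_anon pages * 4 kB = 2299916 kB (as logged)

pages = {
    "active_anon": 574979,
    "active_file": 1138313,
    "inactive_file": 1910555,
    "free": 49018,
}

# Per-zone free memory in kB, from the "Node 0 <zone> free:" lines.
zone_free_kb = {"DMA": 15884, "DMA32": 64132, "Normal": 116056}
normal_min_kb = 116948  # "min:" watermark of the Normal zone


def pages_to_mib(n_pages):
    """Convert a page count to MiB using the inferred page size."""
    return n_pages * PAGE_KB / 1024


page_cache_mib = pages_to_mib(pages["active_file"] + pages["inactive_file"])
free_mib = pages_to_mib(pages["free"])
normal_below_min = zone_free_kb["Normal"] < normal_min_kb

print(f"file-backed page cache: {page_cache_mib:.0f} MiB")
print(f"free memory:            {free_mib:.0f} MiB")
print(f"Normal zone below min watermark: {normal_below_min}")
```

Roughly 11.6 GiB of the 16 GB is file-backed cache and under 200 MiB is free. That pattern does not by itself point to the transcoder holding 14 GB of RAM.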

That is a device driver bug. As shown in the call trace, you can clearly see the failure is inside the nvidia driver itself (the nvidia_uvm module).

There’s nothing we can do here.

As a verification test, I’d revert to a previous version of the drivers.
Any version higher than 418.30 is sufficient for PMS use.

Thanks @ChuckPa

I’ve reverted to 418.43 (patched with nvidia-patch) and re-enabled hardware transcoding.
I’ll monitor it and hope the problem doesn’t occur again.

Alright, perhaps I wasn’t reading htop very carefully.

VIRT stands for the virtual size of a process, which is the sum of memory it is actually using, memory it has mapped into itself (for instance the video card’s RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the program is able to access at the present moment.
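
To see the difference between virtual and resident size for a single process without htop, the kernel exposes both in `/proc/<pid>/status` as `VmSize` and `VmRSS`. The following is a minimal sketch, assuming a Linux system with `/proc` mounted. The function names `parse_status` and `describe` are made up for this example.

```python
from pathlib import Path


def parse_status(text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmRSS:      474112 kB"
            fields[key] = int(rest.split()[0])
    return fields


def describe(pid):
    """Return a one-line summary of virtual vs. resident memory for a PID."""
    mem = parse_status(Path(f"/proc/{pid}/status").read_text())
    virt_mib = mem["VmSize"] / 1024
    res_mib = mem["VmRSS"] / 1024
    return f"pid {pid}: VIRT {virt_mib:.0f} MiB, RES {res_mib:.0f} MiB"


if __name__ == "__main__":
    # Demonstrate on the current process; only works where /proc exists.
    if Path("/proc/self/status").exists():
        print(describe("self"))
```

Calling `describe(193760)` while the transcoder from the htop line above is running would show the two values side by side. The resident figure is the one that counts against physical RAM.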

See the attached screenshot.

It goes back to 12GB after stopping the transcode.
Isn’t that still a bit much when Plex is doing nothing?

The value in red is the virtual memory available to the process, not what it is actually using. Most of that is cached information that would be discarded if needed by another process.

Here is a historical post where they work through memory consumption.

Using normal top, look at the resident size (the RES column, called RSS in ps), not the virtual size (the VIRT column, called VSZ in ps).
