I’ve configured the Docker settings for Plex but, for sanity’s sake, took your advice, spun up the Folding@home Docker image, and confirmed the GPU is being passed through.
What’s even more infuriating is that I can optimise media using the GPU on Plex, just not live transcoding.
EDIT, to confirm I’m now using:
Plex v1.29.2.6364
Nvidia Drivers v515.86.01 (CUDA 11.x)
The Nvidia website offers the latest 525 drivers for the P400, so I can only assume the card supports CUDA 11 / 12. I’ve also downgraded to 515 and 470, neither of which has fixed my issue on Plex v1.28 / v1.29 / latest, yet I can pass the GPU through to Folding@home and utilise it with no problem, as seen above.
When I remove all the Nvidia Docker parameters and try going back to CPU HW transcoding, that does not work either, so something more fundamental must be wrong.
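One way to double-check which driver the environment actually sees after each downgrade is to query `nvidia-smi` directly. The sketch below is only an illustration: it assumes Python 3 and `nvidia-smi` are available wherever it runs, and the helper names `parse_gpu_rows` and `query_gpus` are hypothetical.

```python
import shutil
import subprocess

# nvidia-smi query flags; prints one "name, driver_version" row per GPU.
QUERY = ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]


def parse_gpu_rows(text):
    """Turn 'name, driver' CSV rows into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        # Split on the last comma: the driver version never contains one.
        name, driver = (field.strip() for field in line.rsplit(",", 1))
        rows.append({"name": name, "driver_version": driver})
    return rows


def query_gpus():
    """Run nvidia-smi and return the parsed rows, or [] if it is unavailable."""
    if shutil.which(QUERY[0]) is None:
        return []
    out = subprocess.run(QUERY, capture_output=True, text=True)
    if out.returncode != 0:
        return []
    return parse_gpu_rows(out.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(f"{gpu['name']}: driver {gpu['driver_version']}")
```

Running it on the host and again inside the container shows whether both report the same driver version.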
I’ve got a Plex Pass
Enabled “Use hardware acceleration when available”
Docker has write permissions to /transcode and can see session files being created
Tried both RAM and Cache transcoding
Used every combination of PMS / Nvidia driver versions
Deleted the Codec folder and forced a rescan of all media to redownload
Getting very frustrated with it now, the logs offer no assistance either.
I needed time off (burning the candle at both ends). I still need more (bad chest cold). Will do the best I can.
As requested: PMS 1.29.2.6364 RPM
Nvidia 470 drivers will work unless you need AV1 decode
Nvidia 515.86.01 is needed for AV1 decode but you then run the risk of the other faults in the transcoder
Where this is still crazier than me –
I have the 525.60.13 drivers installed on Ubuntu server 20.04.5 LTS
Card is a P2200
All my videos are ripped by me (redbox and library are amazing alternative sources)
I CANNOT REPRODUCE THE PROBLEM.
Explanations –
Nvidia CUDA 12.0 driver bug with your specific card (High)
There are issues with PMS - Nvidia CUDA 12.0 ABI. (High)
You have questionable video file quality (meh…)
(re-encoded by others unknown and busted up… I’ve seen it happen)
You spilled a beverage in the machine and it has a severe hangover
PS: I used 515.86.01 (CUDA 11.7) for the longest time. I switched to 12.0 (525.60.13) only to test, fearing the worst. Surprisingly, for me, it works. That having been said, the machine is a pure server (NAS + PMS ONLY), console mode, no GUI.
For explanation #1, I am curious how this can be a CUDA 12.0 driver bug when reverting to CUDA 11.x still shows the issue on PMS 1.3x.x. Also, if this were a CUDA bug, shouldn’t I be able to replicate the failure by HW transcoding the same videos that fail with PMS using ffmpeg? I understand there are a lot of variables in play, but blaming the CUDA driver version seems to push the blame onto NVIDIA when the issue appears to be with PMS, since CUDA 11.x and 12.0 show the same issue on PMS 1.3x.x.
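For anyone wanting to try the ffmpeg comparison raised above, here is a minimal sketch of such a test. It assumes a stock `ffmpeg` on the PATH that was built with NVDEC/NVENC support (`-hwaccel cuda` and the `h264_nvenc` encoder); the function name `build_nvdec_test_cmd` is hypothetical. Plex Transcoder is its own build, so a pass or fail here is a data point, not proof either way.

```python
import shutil
import subprocess
import sys


def build_nvdec_test_cmd(src, ffmpeg="ffmpeg"):
    """Return an ffmpeg argv that decodes on the GPU and encodes with NVENC."""
    return [
        ffmpeg,
        "-hide_banner",
        "-hwaccel", "cuda",    # ask ffmpeg to decode on the GPU (NVDEC)
        "-i", src,
        "-an",                 # ignore audio; only the video path is under test
        "-c:v", "h264_nvenc",  # encode to H.264 on the GPU (NVENC)
        "-f", "null", "-",     # discard the output, keep the log
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    cmd = build_nvdec_test_cmd(sys.argv[1])
    if shutil.which(cmd[0]) is None:
        sys.exit("ffmpeg not found on PATH")
    result = subprocess.run(cmd)
    print("ffmpeg exit code:", result.returncode)
```

Pass the failing file as the only argument. ffmpeg may fall back to software decoding if the hardware decoder cannot be initialised, so read its output rather than relying on the exit code alone.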
I confirm PMS 1.29.2.6364 RPM is working for me on my “questionable” video file…
But this file has a really low bitrate, and when I tried 4K transcoding on Plex web it really struggled. It works for 20-30s and then starts buffering (with 1080p transcoding); the same file runs fine with transcoding on an iPad.
Also noticed some weird issues after a stall: transcoding service using some CPU and the CIFS process too… weird!
This is #2 on my list. It’s number 2 on the list because, in my work with others on different projects, those other projects also have the same CUDA 12 problems.
Until such time as anyone can point to a smoking gun, BOTH PMS AND NVIDIA are suspect.
I’m not going to debate. This is an A OR B problem. Flip a coin.
The root cause must be found and reliably repeated by everyone – which has not happened so far even with my controlled testing last week here in the forum
@ChuckPa
Downgraded to the CUDA 11.7 drivers after a kernel update for Fedora just now. Same result: 8-bit HEVC 1080p fails to transcode to H.264 1080p. I’ve included the last 20 lines of the log file from when this happens, and can get a bigger segment if needed. Note this is with transcoding the jellyfish file you supplied above.
Files
/srv/ftp/movies/test/jellyfish-30-mbps-hd-hevc.mkv
Media
Duration 0:30
Bitrate 30454 kbps
Width 1920
Height 1080
Aspect Ratio 1.78
Video Resolution 1080p
Container MKV
Video Frame Rate NTSC
Video Profile main
Part
Duration 0:30
File jellyfish-30-mbps-hd-hevc.mkv
Size 109.03 MB
Container MKV
Indexes sd
Video Profile main
Codec HEVC
Bitrate 30454 kbps
Bit Depth 8
Chroma Location left
Chroma Subsampling 4:2:0
Coded Height 1080
Coded Width 1920
Color Range tv
Frame Rate 29.97 fps
Height 1080
Level 4.1
Profile main
Results
~]$ nvidia-smi; tail -20 /var/lib/plexmediaserver/Library/Application\ Support/Plex\ Media\ Server/Logs/Plex\ Media\ Server.log
Mon Feb 6 15:02:27 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.76       Driver Version: 515.76       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   41C    P0    38W / 130W |    119MiB /  6144MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     15464      C   ...diaserver/Plex Transcoder      115MiB |
+-----------------------------------------------------------------------------+
Feb 06, 2023 15:02:24.273 [0x7f84f89a0b38] ERROR - [Req#ce4/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] Error while decoding stream #0:0: Generic error in an external library
Feb 06, 2023 15:02:24.273 [0x7f84f89a0b38] ERROR - [Req#ce5/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] Could not find ref with POC 893
Feb 06, 2023 15:02:24.274 [0x7f84f89a0b38] ERROR - [Req#ce6/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] No decoder surfaces left
Feb 06, 2023 15:02:24.274 [0x7f84f89a0b38] ERROR - [Req#ce7/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] decoder->cvdl->cuvidDecodePicture(decoder->decoder, &ctx->pic_params) failed -> CUDA_ERROR_INVALID_VALUE: invalid argument
Feb 06, 2023 15:02:24.274 [0x7f84f89a0b38] ERROR - [Req#ce8/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] hardware accelerator failed to decode picture
Feb 06, 2023 15:02:24.274 [0x7f84f89a0b38] ERROR - [Req#ce9/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] Error while decoding stream #0:0: Generic error in an external library
Feb 06, 2023 15:02:24.275 [0x7f84f89a0b38] ERROR - [Req#cea/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] No decoder surfaces left
Feb 06, 2023 15:02:24.276 [0x7f84f89a0b38] ERROR - [Req#ceb/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] decoder->cvdl->cuvidDecodePicture(decoder->decoder, &ctx->pic_params) failed -> CUDA_ERROR_INVALID_VALUE: invalid argument
Feb 06, 2023 15:02:24.276 [0x7f84f89a0b38] ERROR - [Req#cec/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] hardware accelerator failed to decode picture
Feb 06, 2023 15:02:24.276 [0x7f84f89a0b38] ERROR - [Req#ced/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] Error while decoding stream #0:0: Generic error in an external library
Feb 06, 2023 15:02:24.276 [0x7f84f89a0b38] ERROR - [Req#cee/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] Could not find ref with POC 897
Feb 06, 2023 15:02:24.276 [0x7f84f89a0b38] ERROR - [Req#cef/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] No decoder surfaces left
Feb 06, 2023 15:02:24.277 [0x7f84f89a0b38] ERROR - [Req#cf0/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] decoder->cvdl->cuvidDecodePicture(decoder->decoder, &ctx->pic_params) failed -> CUDA_ERROR_INVALID_VALUE: invalid argument
Feb 06, 2023 15:02:24.277 [0x7f84f89a0b38] ERROR - [Req#cf1/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] hardware accelerator failed to decode picture
Feb 06, 2023 15:02:24.277 [0x7f84f89a0b38] ERROR - [Req#cf2/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] Error while decoding stream #0:0: Generic error in an external library
Feb 06, 2023 15:02:24.277 [0x7f84f89a0b38] ERROR - [Req#cf3/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] No decoder surfaces left
Feb 06, 2023 15:02:24.277 [0x7f84f89a0b38] ERROR - [Req#cf4/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] decoder->cvdl->cuvidDecodePicture(decoder->decoder, &ctx->pic_params) failed -> CUDA_ERROR_INVALID_VALUE: invalid argument
Feb 06, 2023 15:02:24.277 [0x7f84f89a0b38] ERROR - [Req#cf5/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] [hevc @ 0x7fbcaa838c80] hardware accelerator failed to decode picture
Feb 06, 2023 15:02:24.278 [0x7f84f89a0b38] ERROR - [Req#cf6/Transcode/oqp2xi9bbabszonacv35fm8y/95a6c9c2-6b18-4759-b295-3ff347a6da89] Error while decoding stream #0:0: Generic error in an external library
Feb 06, 2023 15:02:27.259 [0x7f84f89a0b38] WARN - [Req#d21/Transcode] Got a transcode session ping without a valid session ID.
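The tail above repeats the same few messages, so a quick tally can make the pattern easier to see when a bigger segment is posted. This sketch assumes the line layout shown in the excerpt (timestamp, thread id, level, optional `[Req#...]` tag, message); the regex and the names `LINE_RE` and `summarize` are illustrative and not part of Plex.

```python
import re
import sys
from collections import Counter

# Layout assumed from the excerpt above, not from any Plex specification.
LINE_RE = re.compile(
    r"^\w{3} \d{2}, \d{4} [\d:.]+ "  # timestamp, e.g. "Feb 06, 2023 15:02:24.273"
    r"\[0x[0-9a-f]+\] "              # thread id
    r"(?P<level>[A-Z]+) - "          # log level
    r"(?:\[Req#[^\]]*\] )?"          # optional request tag
    r"(?P<msg>.*)$"
)


def summarize(lines):
    """Count (level, message) pairs, ignoring per-line addresses and POC numbers."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a log line in the assumed layout
        # Drop the "[hevc @ 0x...]" decoder prefix so identical errors group together.
        msg = re.sub(r"^\[\w+ @ 0x[0-9a-f]+\] ", "", m["msg"])
        # Collapse the varying picture order count into a placeholder.
        msg = re.sub(r"\bPOC \d+", "POC <n>", msg)
        counts[(m["level"], msg)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        for (level, msg), n in summarize(fh).most_common():
            print(f"{n:5d}  {level:5s}  {msg}")
```

Pass the path to `Plex Media Server.log` as the only argument; the most frequent messages are printed first.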