Server Version#: 1.26.0.5715
Player Version#: Plex Web 4.76.1
Docker Version: 20.10.14, build a224086
OS Version: Ubuntu 22.04
Nvidia Driver: 510.60.02
Docker Compose File:
services:
  plex:
    container_name: plex
    image: plexinc/pms-docker:latest
    restart: unless-stopped
    environment:
      - TZ=America/Los_Angeles
      - PLEX_CLAIM=<removed>
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=all
    runtime: nvidia
    ports:
      - 32400:32400
    volumes:
      - /home/<user>/plex:/config
      - /home/<user>/plex/transcode/:/transcode
      - /mnt/pool/:/data
Greetings-
I recently installed an RTX 3060 in my Plex server to use GPU hardware-accelerated encoding. My issue is that it works when I first spin up my Docker container, but after some time (most recently within 12 hours) it stops working. Right after starting the container, I can exec into it, run nvidia-smi, and get the following output (the same output as on the host):
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.60.02    Driver Version: 510.60.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:43:00.0  On |                  N/A |
|  0%   36C    P8    18W / 170W |      3MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
But once it breaks, running nvidia-smi in the container gives the following instead (the host still shows the output above):
Failed to initialize NVML: Unknown Error
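For anyone wanting to reproduce the comparison, here is a minimal sketch of the host-versus-container check. It assumes the container is named plex, as in the compose file above; the function names classify_nvidia_smi and compare_host_and_container are hypothetical helpers, not part of any NVIDIA or Docker tooling.

```shell
# Classify nvidia-smi output read from stdin.
# Prints "nvml-failed" if the NVML error from this post appears,
# "ok" if the normal NVIDIA-SMI banner is present, "unknown" otherwise.
classify_nvidia_smi() {
    output=$(cat)
    case "$output" in
        *"Failed to initialize NVML"*) echo "nvml-failed" ;;
        *"NVIDIA-SMI"*)                echo "ok" ;;
        *)                             echo "unknown" ;;
    esac
}

# Run nvidia-smi on the host and inside a container, then classify both.
# $1: container name (assumed to be "plex" if omitted).
compare_host_and_container() {
    name="${1:-plex}"
    echo "host:      $(nvidia-smi 2>&1 | classify_nvidia_smi)"
    echo "container: $(docker exec "$name" nvidia-smi 2>&1 | classify_nvidia_smi)"
}
```

Calling compare_host_and_container plex in the broken state should report "ok" for the host and "nvml-failed" for the container, which matches what I am seeing.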
Poking around my Plex logs, I see this error repeating:
May 05, 2022 12:41:03.515 [0x7f77bd4f0b38] ERROR - [Transcode] [FFMPEG] - cu->cuCtxCreate(&hwctx->cuda_ctx, desired_flags, hwctx->internal->cuda_device) failed
May 05, 2022 12:41:03.515 [0x7f77bd4f0b38] ERROR - [Transcode] [FFMPEG] - -> CUDA_ERROR_NOT_PERMITTED: operation not permitted
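To narrow down when the errors begin, the log can be searched for the first occurrence. This is a rough sketch that assumes every log line starts with a timestamp in the format shown above; first_cuda_denied and count_cuda_denied are hypothetical helper names.

```shell
# Print the timestamp (first four whitespace-separated fields, e.g.
# "May 05, 2022 12:41:03.515") of the first CUDA_ERROR_NOT_PERMITTED
# line read from stdin, then stop reading.
first_cuda_denied() {
    awk '/CUDA_ERROR_NOT_PERMITTED/ { print $1, $2, $3, $4; exit }'
}

# Count how many lines on stdin contain CUDA_ERROR_NOT_PERMITTED.
count_cuda_denied() {
    grep -c "CUDA_ERROR_NOT_PERMITTED"
}
```

Both functions read from stdin, for example first_cuda_denied < "$LOG", where $LOG is a placeholder for the Plex Media Server log file under the mounted config directory (the exact path is an assumption and may differ per setup).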
What I find most strange is that HW encoding works for a little while and then stops working. It seems like it should either be broken from the start or work completely. I am not sure what changes over time. Maybe it has to do with DVR or the daily butler tasks (throwing stuff against the wall here to see what sticks)?
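One way to pin down the moment things break would be to poll the container until the check fails and record the time. The sketch below assumes nvidia-smi exits with a non-zero status when NVML fails to initialize; wait_for_failure is a hypothetical helper name.

```shell
# Poll a probe command until it fails, then print when that happened.
# $1: seconds to sleep between probes; remaining args: the probe command.
wait_for_failure() {
    interval="$1"
    shift
    # Keep looping while the probe succeeds; its output is discarded.
    while "$@" >/dev/null 2>&1; do
        sleep "$interval"
    done
    echo "probe failed at $(date '+%Y-%m-%d %H:%M:%S')"
}
```

For example, wait_for_failure 60 docker exec plex nvidia-smi would check once a minute. The printed time could then be lined up against DVR recordings, butler tasks, or anything else happening on the host around then.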
I am willing to post more logs if that would help.
My current plan is to try swapping over to the linuxserver.io Plex image and see if that works. I have also found some other posts on Reddit and here about getting HW encoding working, with various things to try: changes to some NVIDIA files on the host system, some Docker Compose changes, etc. But those fixes are usually for when HW transcoding doesn’t work at all. In fact, I haven’t found anyone else with this exact issue (works at first, then stops working after some time), hence this post.
If anyone has any ideas I would very much appreciate any input. Thanks!