I’m experiencing a critical and persistent issue where my entire host machine instantly reboots or hard shuts down the moment Hardware Transcoding (QuickSync) is initiated by Plex Media Server (PMS).
I have gone through extensive troubleshooting, including checking all the common Docker/permission issues and trying known kernel workarounds. I am now seeking advice on specific driver/firmware configurations that might cause this instability on Coffee Lake CPUs.
System Configuration
Host OS: Debian 12 (Bookworm)
Kernel Version: 6.1.0-40-amd64
CPU: Intel Core i5-8500T (QuickSync Gen 8, Coffee Lake)
PMS Setup: Running in a Docker container (latest official image).
Plex Pass: Confirmed (Required for hardware transcoding).
The problem
When any stream triggers hardware transcoding (even a very light single stream), the host machine instantly and silently crashes/reboots. This happens with no prior warning or error message.
Troubleshooting performed (crucial steps)
Docker Setup Checked: Confirmed that /dev/dri is correctly mapped into the container, and the Plex user has the necessary permissions (UID/GID) to access renderD128 and card0.
Temperature Checked: Monitored system temperatures using sensors. CPU/iGPU temperatures remain low/normal (50 °C to 60 °C) during the brief period before the crash. Overheating is ruled out.
Kernel Logs Inspected: Checked dmesg and journalctl -b -1 immediately after the crash. There is no trace of a kernel panic, GPU HANG, or any critical i915 error logged before the system reset.
i915 Workaround Failed: Added the kernel argument i915.enable_guc=0 to the GRUB command line (to disable advanced GPU management) and rebooted the host. The issue still occurs.
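For anyone who wants to repeat the /dev/dri permission check from the first step above, here is a minimal sketch. The function name check_dri_access is made up for illustration, and it only tests plain read/write access for whoever runs it, so it should be run inside the container as the user PMS runs under.

```shell
# Sketch: report whether the current user can read and write each DRM node.
# Assumes the nodes live in /dev/dri; pass another directory to override.
check_dri_access() {
    dir="${1:-/dev/dri}"
    found=0
    status=0
    for node in "$dir"/renderD* "$dir"/card*; do
        [ -e "$node" ] || continue   # an unmatched glob stays literal, skip it
        found=1
        if [ -r "$node" ] && [ -w "$node" ]; then
            echo "ok      $node"
        else
            echo "denied  $node"
            status=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no DRM nodes found in $dir" >&2
        return 2
    fi
    return "$status"
}
```

Calling check_dri_access with no argument looks at /dev/dri. A non-zero return means either a node was not accessible (1) or no nodes were visible at all (2).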
Question for the community
Has anyone running an i5-8500T (Coffee Lake) on Linux 6.1 experienced this specific type of unlogged, instant machine crash?
Are there any known Plex Transcoder settings or specific Intel driver arguments (modprobe/i915) that could be used to slow down or “soften” the initial power demand of the iGPU to prevent the hard reset?
Any advice on solving this stability issue is greatly appreciated! Thank you.
Better to run in native mode but if you must run docker, change it to host mode networking. You could also switch to plexinc/pms-docker image to be sure you are most compliant.
Honestly, if starting a hardware transcode causes a host reboot, you have CPU, memory, or power supply issues.
I don’t really have a problem with Plex on Docker, nor with accessing my library remotely.
I don’t use the ‘host’ network because I’ve put Plex in a tunnel to avoid exposing my ports. And my current configuration doesn’t allow me to put it on the ‘host’ network.
I’m going to try to see if I have a CPU/MEM issue, but I highly doubt it because I have a game server on my NUC that loads both quite a bit at the same time when I launch it.
OS issues → kernel panics, usually logged, limited system functions, e.g. only mouse but not menus. Windows would often BSOD.
Software issues → program crashes/freezes and hopefully logged
That’s a new one I’ve not heard before, but I have less knowledge of Linux. I would hope something would log this, but you obviously looked.
Given what you’ve described, maybe power supply, mainboard, or drivers loaded by the OS? Maybe you got a static discharge? Good job checking temps and whatnot.
Have you run memtest86?
Swapping hardware is what we used to try in my department.
If you have a good nose, you might smell burnt components.
Hopefully @ChuckPA sees your topic and can give insight into OS level drivers, HW accel and QSV.
I know you said thermal isn’t an issue, but I will still challenge you on dust bunnies.
Just because the CPU sensors report ok doesn’t mean spikes elsewhere aren’t happening (especially over the QSV)
Power supply should not be an issue, as it’s a 35 W CPU (70 W typical power consumption), unless a capacitor in the CPU is failing (noisy power).
QSV does put a load on the RAM.
If this machine has been running for several years, I would start with the DIMMs:
blow it out anyway for good measure
Disconnect and reconnect cable connections (they do corrode)
Pull each DIMM and give the edge connectors a wash with isopropyl then put back when dry. (this scrapes off any oxidation which might have built up).
Blow dust out of the DIMM slot while the DIMM is out.
I will not press luck by suggesting anything with the CPU as it’s not warranted at this time.
I would LOVE to say to UPGRADE to the 6.8 kernel and get stable.
(6.1 has definitely had its issues)
For the problem of voltages/currents out of range, I tried adding the arguments i915.enable_dc=0 and intel_idle.max_cstate=1 to the GRUB line as mentioned on this site: linuxreviews.org
After this change, I noticed that I could no longer choose the quality of my media.
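After editing the GRUB line it is worth confirming that the arguments actually reached the running kernel (the GRUB config has to be regenerated and the host rebooted first). A small sketch, with list_gpu_boot_args being a made-up helper name; it only reads the command line and does not tell you what the driver did with the values.

```shell
# Sketch: print the i915.* and intel_idle.* arguments from a kernel command line.
# With no argument it reads the running kernel's command line from /proc/cmdline.
list_gpu_boot_args() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    # One argument per line, then keep only the two prefixes of interest.
    # grep exits non-zero when nothing matches, i.e. none of them are set.
    printf '%s\n' "$cmdline" | tr ' ' '\n' | grep -E '^(i915|intel_idle)\.'
}
```

Running list_gpu_boot_args with no argument on the host should list i915.enable_dc=0 and intel_idle.max_cstate=1 if they were picked up; empty output means the kernel booted without them.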
For the problem of the low-level driver, I saw on another site that I needed to add the argument i915.enable_guc=0 to the GRUB line, which I also did. I checked my logs from just before the crash with the command: journalctl -b -1 -k | grep -i "panic\|kernel\|error\|crit" which didn’t turn up anything useful.
In all the logs I was able to look at, I didn’t find anything abnormal, or maybe I skipped a line by accident.
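One thing about that grep: in the default journalctl -k output every line contains "kernel:", so a pattern that includes "kernel" matches the whole log and filters nothing out. Below is a narrower sketch. The keyword list is only my guess at what tends to matter for GPU or hardware faults, not an official list, and filter_kernel_log is a made-up name.

```shell
# Sketch: narrow kernel log text on stdin to lines that commonly accompany
# GPU or hardware faults. The keyword list is an assumption; adjust as needed.
filter_kernel_log() {
    grep -i -E 'panic|oops|mce:|machine check|hardware error|i915|gpu hang|thermal|watchdog'
}
```

Usage would be something like journalctl -b -1 -k --no-pager | filter_kernel_log. It is also worth reading the unfiltered tail (journalctl -b -1 -k --no-pager | tail -n 50), since with a hard reset the last messages may never have been written to disk, and an empty result does not prove nothing happened.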
I also thought about doing a memtest86, I need to do it during the day. I’ll keep you posted
I can see all the temperatures of each processor core, isn’t that sufficient?
Regarding the machine, it’s an HP ProDesk 400 G4 that I got about a year ago, which I cleaned and changed the processor’s thermal paste, as well as installing “new” (second-hand) RAM.
Edit : On one of your previous posts dating back to 2020 → forums.plex.tv you mentioned adding an argument to the Preferences.xml file. Do you think that could solve my problem?
Edit 2 : I made a mistake, I upgraded to kernel 6.12, is that bad?
Per your question: adding VaapiDriver="i965" is no longer viable or necessary. Since that was written, 1) all drivers have been updated and 2) the i965 driver has been removed from the Plex distribution. It would only be viable if you were running a pre-6.0 Linux kernel.
There’s no reason to set any transcoder settings here. I have four i7-8809G NUC8 boxes (same Intel iGPU component, except I have an AMD element added).
“New (second-hand) RAM”
This is a concern only because it’s unknown.
DDR4-2666 is an older clock rate and perfect spec for the 8500.
If the RAM is “used” then you don’t know if it was overclocked, overvoltaged, or overheated. Abusing RAM like this can easily cause the failures you’re seeing because it’s technically ‘damaged’ (burned) inside.
– Download this and put it on a USB stick then boot and run it.
(It’s a standalone memory tester and the best in the business IMHO)
– It has multiple tests it can run, enable everything. The goal is to work the RAM, within operating spec, as hard as possible and see if it faults in any way.
– The testing has a Burn-in mode. If you have time to let it do a burn-in test overnight, do so.
When testing is complete, CAREFULLY review the results.
There should be ZERO faults anywhere.
Question:
All sticks are the same size and speed?
– The total amount of RAM is less than or equal to 64 GB (CPU maximum)?
How is the memory installed (placed)?
– If mixed sizes or specs, are they installed in matching slots in the machine ( A1-A2, B1-B2, etc) so Dual Channel memory will enable ?
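If opening the case is inconvenient, the questions above can usually be answered from the running system with dmidecode. This is a rough sketch that assumes the usual "Memory Device" layout of dmidecode -t memory output (Size, Locator and Speed fields); summarize_dimms is a made-up name, and some firmware reports incomplete or odd values.

```shell
# Sketch: condense `dmidecode -t memory` output (on stdin) to one line per slot:
# locator, size, rated speed, separated by tabs.
summarize_dimms() {
    awk -F': ' '
        function flush() { if (loc != "") printf "%s\t%s\t%s\n", loc, size, speed }
        /^Memory Device/  { flush(); loc = ""; size = ""; speed = "" }
        /^[ \t]+Size:/    { size = $2 }
        /^[ \t]+Locator:/ { loc = $2 }    # does not match "Bank Locator:"
        /^[ \t]+Speed:/   { speed = $2 }  # does not match "Configured Memory Speed:"
        END               { flush() }
    '
}
```

Run it as root with dmidecode -t memory | summarize_dimms. Matching sizes and speeds on every populated slot is what you want to see; an empty slot normally shows "No Module Installed" as its size.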
I’m back to give an update regarding the arguments in the GRUB command line. It’s best not to remove them, as I experienced instabilities upon rebooting.