Server Version#: 1.25.1.5286
Player Version#: Using Chrome (96.0.4664.45) Web version 4.71.0
HW: Intel 12600k
I am using the official docker container (“plexpass” variant) on Ubuntu 20.04. In addition I am running a custom build of kernel 5.16.0-rc2-custom to enable full support for the CPU and peripherals. DRI nodes (/dev/dri) are mapped into the container.
HW transcoding of SDR 4K content works beautifully, but when I try to play back any HDR content it fails, and I see the following in the kernel log (dmesg):
[ 2794.709032] i915 0000:00:02.0: [drm] Resetting vcs0 for preemption time out
[ 2794.711329] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:28fffffd, in Plex Transcoder [51622]
[ 2803.076590] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:28fffffd, in Plex Transcoder [51622]
[ 2803.077611] i915 0000:00:02.0: [drm] Resetting vcs0 for stopped heartbeat on vcs0
[ 2803.078157] i915 0000:00:02.0: [drm] Resetting chip for stopped heartbeat on vcs0
[ 2803.180055] [drm:__uc_sanitize [i915]] *ERROR* Failed to reset GuC, ret = -110
[ 2803.273118] i915 0000:00:02.0: [drm] *ERROR* Failed to reset chip
[ 2803.273132] i915 0000:00:02.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by intel_gt_reset+0x24f/0x2c0 [i915]
[ 2803.376059] [drm:__uc_sanitize [i915]] *ERROR* Failed to reset GuC, ret = -110
[ 2803.377916] i915 0000:00:02.0: [drm] Plex Transcoder[51622] context reset due to GPU hang
[ 2803.520212] show_signal_msg: 42 callbacks suppressed
[ 2803.520216] Plex Media Serv[51097]: segfault at 0 ip 0000000000000000 sp 00007f5355652788 error 14
[ 2803.520221] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 2807.637534] Fence expiration time out i915-0000:00:02.0:Plex Transcoder[51622]:21ac!
[ 3254.147304] Attempt to set a LOCK_MAND lock via flock(2). This support has been removed and the request ignored.
[ 3955.711097] Plex Media Serv[64313]: segfault at 0 ip 0000000000000000 sp 00007f3f7d339018 error 14
[ 3955.711103] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 3963.976659] Plex Media Serv[64496]: segfault at 0 ip 0000000000000000 sp 00007effbcab6018 error 14
[ 3963.976666] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 3972.014925] Plex Media Serv[64783]: segfault at 0 ip 0000000000000000 sp 00007f1d6e5d7e28 error 14
[ 3972.014946] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
Since this only happens with HDR content I wonder if this has something to do with the tone mapping, perhaps using OpenCL?
I also installed PMS without docker and upgraded all the Intel graphics/dri/drm/OCL stack from their repo to see if these versions behave differently. So far no difference.
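For anyone triaging a similar dmesg dump before filing an upstream report, here is a minimal Python sketch that sorts lines like the ones quoted above into a few event categories. The category names and the `classify` helper are made up for illustration, and the patterns are derived only from the messages in this post, so other i915 failures may not be caught.

```python
import re
from collections import Counter

# Leading "[ 2794.709032] " kernel timestamp, followed by the message.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

# Patterns copied from the messages quoted in this post.
# The category names are hypothetical labels for this sketch.
PATTERNS = {
    "gpu_hang": re.compile(r"GPU HANG: ecode (\S+), in (.+?) \[(\d+)\]"),
    "reset": re.compile(r"Resetting (\w+) for (.+)$"),
    "guc_reset_failed": re.compile(r"Failed to reset GuC, ret = (-?\d+)"),
    "chip_reset_failed": re.compile(r"Failed to reset chip"),
    "segfault": re.compile(r"(.+?)\[(\d+)\]: segfault at"),
}


def classify(lines):
    """Return (timestamp, category) for every line matching a known pattern."""
    events = []
    for line in lines:
        m = TIMESTAMP.match(line.strip())
        if not m:
            continue  # not a timestamped kernel log line
        ts, msg = float(m.group(1)), m.group(2)
        for name, pattern in PATTERNS.items():
            if pattern.search(msg):
                events.append((ts, name))
                break  # first matching category wins
    return events


SAMPLE = """\
[ 2794.709032] i915 0000:00:02.0: [drm] Resetting vcs0 for preemption time out
[ 2794.711329] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:4:28fffffd, in Plex Transcoder [51622]
[ 2803.180055] [drm:__uc_sanitize [i915]] *ERROR* Failed to reset GuC, ret = -110
[ 2803.273118] i915 0000:00:02.0: [drm] *ERROR* Failed to reset chip
[ 2803.520216] Plex Media Serv[51097]: segfault at 0 ip 0000000000000000 sp 00007f5355652788 error 14
"""

if __name__ == "__main__":
    events = classify(SAMPLE.splitlines())
    for ts, name in events:
        print(f"{ts:12.6f}  {name}")
    print(Counter(name for _, name in events))
```

To run it against a live system, the output of `dmesg` can be saved to a file and passed in as `classify(open(path))`. Seeing the engine reset and GPU hang appear before the segfaults supports the reading that the server crash follows the GPU hang rather than causing it.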
Can you confirm if this is specific to tone mapping, or if it applies to all 10-bit HEVC content?
These errors largely look like kernel issues. This sometimes happens with very new chips; driver support generally doesn’t stabilize until a bit after release. You’ll probably want to report this to the upstream driver devs.
My interpretation of what Ridley means: if you turn off tone mapping in PMS and then try transcoding, do you get the same errors, different errors, or does HW transcoding work when tone mapping is off?
Thank you for the clarification, @TeknoJunky! It seems that when I disable tone mapping, the HW transcode works OK, but as expected, some content looks washed out or otherwise strange.
@Ridley, can you elaborate on this so I can write a bug report upstream? Is OpenCL used here?
Not quite what I meant (some 10-bit content isn’t HDR at all), but it’s an equivalent test regardless.
Yes, tone mapping in Plex on Intel GPUs is done via OpenCL (i.e. the Intel Compute Runtime driver). It’s possible that you might have a Plex-specific issue in addition to whatever’s causing those kernel errors; I might be able to get a bit more information by looking over your server logs.
Dec 02, 2021 19:43:04.762 [0x7f697d648b38] ERROR - [Transcoder] [AVHWDeviceContext @ 0x7f3bcef27d40] No matching devices found.
Looks like we’re failing to locate an OpenCL device corresponding to your GPU. This could be due to a CL driver issue. Try running this on the command line for some additional diagnostics:
[AVHWDeviceContext @ 0x7f01a2080ec0] 2 OpenCL platforms found.
[AVHWDeviceContext @ 0x7f01a2080ec0] platform_version does not match ("OpenCL 3.0 ").
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
(If you have multiple ICDs installed and OpenCL works, you can ignore this message)
[AVHWDeviceContext @ 0x7f01a2080ec0] No devices found on platform "Intel Gen OCL Driver".
[AVHWDeviceContext @ 0x7f01a2080ec0] No matching devices found.
[AVHWDeviceContext @ 0x7f01a2080ec0] 2 OpenCL platforms found.
[AVHWDeviceContext @ 0x7f01a2080ec0] 0.0: Intel(R) OpenCL HD Graphics / Intel(R) Graphics [0x4680]
[AVHWDeviceContext @ 0x7f01a2080ec0] Platform Intel Gen OCL Driver does not export the VAAPI device enumeration function.
[AVHWDeviceContext @ 0x7f01a2080ec0] Beignet DRM to OpenCL image mapping function not found (clCreateImageFromFdINTEL).
[AVHWDeviceContext @ 0x7f01a2080ec0] Beignet DRM to OpenCL mapping not usable.
[AVHWDeviceContext @ 0x7f01a2080ec0] cl_intel_va_api_media_sharing found as platform extension.
[AVHWDeviceContext @ 0x7f01a2080ec0] Intel QSV to OpenCL mapping function found (clCreateFromVA_APIMediaSurfaceINTEL).
[AVHWDeviceContext @ 0x7f01a2080ec0] Intel QSV in OpenCL acquire function found (clEnqueueAcquireVA_APIMediaSurfacesINTEL).
[AVHWDeviceContext @ 0x7f01a2080ec0] Intel QSV in OpenCL release function found (clEnqueueReleaseVA_APIMediaSurfacesINTEL).
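The diagnostic output above is easier to read once it is reduced to three facts: which OpenCL platforms had no usable devices, which devices were listed, and whether a "No matching devices found." message appeared. Below is a minimal Python sketch that does this. The `summarize` helper is hypothetical, and it assumes that lines of the form `0.0: A / B` mean platform index, device index, platform name, and device name, which is an interpretation of the log rather than something stated in this thread.

```python
import re

# Strip the "[AVHWDeviceContext @ 0x...]" prefix that precedes most messages.
PREFIX = re.compile(r"^\[AVHWDeviceContext @ 0x[0-9a-f]+\]\s*")
NO_DEVICES = re.compile(r'No devices found on platform "(.+)"\.')
# Assumed layout: "<platform idx>.<device idx>: <platform name> / <device name>"
DEVICE = re.compile(r"^(\d+)\.(\d+): (.+?) / (.+)$")


def summarize(lines):
    """Collect empty platforms, listed devices, and the no-match flag."""
    summary = {"empty_platforms": [], "devices": [], "no_match": False}
    for line in lines:
        msg = PREFIX.sub("", line.strip())
        m = NO_DEVICES.search(msg)
        if m:
            summary["empty_platforms"].append(m.group(1))
            continue
        m = DEVICE.match(msg)
        if m:
            summary["devices"].append((m.group(3), m.group(4)))
            continue
        if msg == "No matching devices found.":
            summary["no_match"] = True
    return summary


# Lines copied from the output quoted above.
SAMPLE = """\
[AVHWDeviceContext @ 0x7f01a2080ec0] 2 OpenCL platforms found.
[AVHWDeviceContext @ 0x7f01a2080ec0] No devices found on platform "Intel Gen OCL Driver".
[AVHWDeviceContext @ 0x7f01a2080ec0] No matching devices found.
[AVHWDeviceContext @ 0x7f01a2080ec0] 0.0: Intel(R) OpenCL HD Graphics / Intel(R) Graphics [0x4680]
"""

if __name__ == "__main__":
    print(summarize(SAMPLE.splitlines()))
```

On this output the sketch reports "Intel Gen OCL Driver" (the Beignet ICD) as a platform with no devices and lists one device on the "Intel(R) OpenCL HD Graphics" platform. That matches the Beignet message in the log saying it is probably the wrong opencl-icd package for this hardware.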
I went off the beaten path a bit and pulled in the 5.15 kernel, with mixed results. I can get Plex to transcode a 4K HDR 10-bit file (tone mapping enabled), but it ends up with crazy artifacts (green/white squares) across frames.
Attached are my full logs from running @Ridley's diagnostic test (looks successful?).
@Ridley, did you ever get a chance to circle back on this? I went and got an Nvidia P4 card so I can do stable HW transcoding, but I basically wasted my money on the new 12th Gen Intel chip until this works.