Server Version#: 1.21.3.4021 (via official plexinc/pms-docker:latest docker image)
Player Version#: Any (Roku player and web player both have the same issue)
OS: Arch Linux (updated as of last week) running Linux 5.10.16
Motherboard: Supermicro X9DRi-F
CPU: Dual Xeon E5-2650 v2 @ 2.60GHz (8 cores each, total of 32 threads)
RAM: 128GB of DDR3-1600 ECC (16x8GB) (a mix of 1R and 2R, if I remember correctly, but they're populated in a balanced configuration)
I’ve got a Plex server running on some pretty decent hardware. SDR transcoding works perfectly fine, even 4K HDR > 1080p HDR (I limit my remote stream quality to 12Mbps 1080p). However, with HDR tonemapping on, the same transcode struggles to stay realtime, usually running at around 0.9x speed. Dropping to 8Mbps doesn’t really help.
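For what it’s worth, here’s roughly how I plan to reproduce the tonemapped transcode outside of Plex to rule out the server itself. This is only a sketch: the zscale/tonemap chain is the commonly documented ffmpeg software HDR-to-SDR path, which may not match what Plex’s transcoder does internally, and the input path is just a placeholder.

```python
# Rough repro of a tonemapped 4K HDR -> 1080p SDR transcode outside Plex.
# Assumes an ffmpeg build with zscale (zimg) compiled in; INPUT is a
# hypothetical path to a 4K HDR test file.
import subprocess

INPUT = "/tank/media/sample-4k-hdr.mkv"  # placeholder test file

# Commonly documented software HDR->SDR filter chain; Plex's internal
# transcoder may use a different one.
VF = ("zscale=t=linear:npl=100,format=gbrpf32le,zscale=p=bt709,"
      "tonemap=tonemap=hable:desat=0,zscale=t=bt709:m=bt709:r=tv,"
      "format=yuv420p")

subprocess.run([
    "ffmpeg", "-hide_banner", "-i", INPUT,
    "-vf", VF,
    "-c:v", "libx264", "-preset", "veryfast", "-b:v", "12M",
    "-f", "null", "-",  # discard output; the speed= readout is what matters
], check=True)
```

If ffmpeg’s own `speed=` figure also sits around 0.9x here, the slowdown isn’t Plex-specific.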
The weird thing is that the hardware isn’t really being taxed at all. In fact, the tonemapped transcode appears to use slightly FEWER resources than the non-tonemapped one. I’ve gone through all the possible bottlenecks I can think of. During either transcode, CPU load barely hits 10.0 on a 32-thread system (i.e. roughly 30% usage). The transcoding temp directory is on a RAM disk (tmpfs). The ZFS pool holding the original media can easily pull over 2 GiB/s sequential reads, with more than enough IOPS to handle high-bitrate video (it’s currently 9x 8TB mirrors). There’s no noticeable spike in iowait.
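One thing I still want to check is whether the low average load is hiding a single pegged core. Here’s a minimal sketch of the per-core sampling I have in mind (psutil is the only dependency; the 10-second window is arbitrary):

```python
# Check for a single-thread bottleneck: an average load of ~10 on 32
# threads can still mean one core sitting at 100% while the rest idle.
# Run this while the tonemapped transcode is active.
import time
import psutil

psutil.cpu_percent(percpu=True)   # first call just primes the counters
time.sleep(10)                    # measure over a 10-second window
per_core = psutil.cpu_percent(percpu=True)
print(f"busiest core: {max(per_core):.0f}%   "
      f"average: {sum(per_core) / len(per_core):.0f}%")
```

If the busiest core is pinned near 100% while the average stays around 30%, the tonemapping stage is likely serialized on one thread.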
I’m a little confused. I get that tonemapping is more CPU intensive, but wouldn’t you then expect to see higher CPU usage, or at least see the load approach its limits before anything starts queuing? The server can handle 2+ non-tonemapped transcodes in realtime, so why can’t it manage even a single tonemapped one?
The only other culprit I can think of is the transcode dir being in RAM. Is the tonemapping overtaxing memory bandwidth, since it has to both read and write RAM for the same operation? I can try testing again with the transcode dir on another zpool (a single mirror of SATA SSDs).
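To sanity-check the memory-bandwidth theory before reshuffling pools, I could run a quick copy-bandwidth probe like the one below. It’s a rough single-threaded numpy sketch, not a substitute for a proper STREAM run, and the buffer size is an arbitrary assumption:

```python
# Rough single-threaded memory copy-bandwidth probe. If the result is far
# below what quad-channel DDR3-1600 per socket should manage, memory
# bandwidth starts to look plausible as the bottleneck.
import time
import numpy as np

src = np.ones(64 * 1024 * 1024)   # 512 MiB of float64
dst = np.empty_like(src)

reps = 10
start = time.perf_counter()
for _ in range(reps):
    np.copyto(dst, src)           # each copy reads 512 MiB and writes 512 MiB
elapsed = time.perf_counter() - start

gib_moved = reps * 2 * src.nbytes / 2**30
print(f"~{gib_moved / elapsed:.1f} GiB/s effective copy bandwidth")
```

Running it alongside an active tonemapped transcode versus on an idle box would show whether the transcode is actually eating into available bandwidth.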