Informal testing results of hardware acceleration

These are super informal, non-rigorous tests, but I thought they might be of use to someone trying to choose hardware (at least on the lower end). A little background: I’ve been kicking around the idea of moving my virtual Plex installation back to physical hardware since hardware acceleration became available in PMS. My current VM has six 2.1 GHz cores, which is what I found I needed to support three 1080p streams at one point. I wanted to move to something that uses a little less power, and also free up resources on my VM host.

I ended up biting the bullet and buying an on-sale SFF Dell with a Kaby Lake processor in it. I installed a second instance of Plex, pointed it at one of my smaller file shares, and ran some tests. For reference, the new system is:

Dell Inspiron 3268 running Win10 Home, 1709
Kaby Lake Core i3-7100@3.9GHz (dual core, 4 thread)
Plex has hardware acceleration enabled


I set up three simultaneous streams. The three source files, in rough detail:

1080p, DTS; overall bitrate 4800 kb/s
1920x800, DTS; overall bitrate 10.3 Mb/s
1080p, DTS; overall bitrate 19.9 Mb/s

Playback was through the web client in Firefox, running on Win10 Pro 1803.

I eyeballed the task manager on the Plex server as I ran my tests (I said it was informal)

For the first multi-stream test, I set each stream to 720p @ 4 Mbit. Task Manager showed CPU at around 30% utilization and GPU utilization of 30-40%, never going above 50%.

For the second test, I set each stream to 1080p @ 8 Mbit. The results here were a little concerning. The Plex server’s CPU pegged at 100%. It more or less kept up, but it was a close thing; there was certainly no excess CPU time available. The GPU was at 0%, which I was not happy to see.

I also used the 10.3 Mb/s file for a few quick tests with a single stream only. Results were similar:

1080p @ 8 Mbit: 70% CPU / 0% GPU
1080p @ 10 Mbit: 70% CPU / 0% GPU
720p @ 3 Mbit: 40% CPU / 30% GPU
720p @ 2 Mbit: 40% CPU / 30% GPU
480p @ 1.5 Mbit: harder to tell. It climbed up to the 40/30 range but was often below it; the GPU was in use, though.

So I probably need to start another thread to find out why the 1080 test was showing 0% GPU. It certainly seemed like transcoding to 1080p was not using hardware acceleration at all, which wasn’t what I was hoping to see. The 720 results still make me OK with migrating to this hardware, as my users don’t normally go above 720 anyway. I’d still like better 1080 performance, though.

Hopefully my informal testing will help someone decide whether the hardware they are looking at will be up to serving whatever they are after.

Please include the video stream section from the Plex XML info of your test videos. It looks like this:

<Stream id="1896486" streamType="1" default="1" codec="h264" index="0" bitrate="9575" language="English" languageCode="eng" bitDepth="8" chromaLocation="left" chromaSubsampling="4:2:0" frameRate="23.976" hasScalingMatrix="0" height="808" level="41" profile="high" refFrames="5" requiredBandwidths="26402,19668,17852,15172,13481,12685,11395,11395" scanType="progressive" width="1920" displayTitle="English (H.264 High)"/>

It provides crucial data for diagnosing this.
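If you have several test files, the handful of attributes that usually matter here (codec, profile, bit depth, resolution, bitrate) can be pulled out of each Stream element with a few lines of Python. This is only a rough sketch: the choice of fields and the function name summarize_stream are my own, and the sample is a trimmed copy of the element quoted above.

```python
import xml.etree.ElementTree as ET

# Attributes that tend to matter when diagnosing hardware transcoding.
# This selection is an assumption, not an official Plex list.
FIELDS = ("codec", "profile", "bitDepth", "width", "height",
          "bitrate", "level", "refFrames")

def summarize_stream(xml_text):
    """Return the selected attributes of one <Stream> element as a dict."""
    stream = ET.fromstring(xml_text)
    return {name: stream.get(name) for name in FIELDS}

# Trimmed copy of the example <Stream> element quoted above.
example = ('<Stream id="1896486" streamType="1" codec="h264" bitrate="9575" '
           'bitDepth="8" height="808" level="41" profile="high" '
           'refFrames="5" width="1920"/>')

for name, value in summarize_stream(example).items():
    print(f"{name}: {value}")
```

Attributes that are missing from a given element simply come back as None, so the same function works on HEVC entries that lack some of the H.264 fields.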

Also important: keep a browser tab open with a second instance of the web app on the Now Playing page. It tells you whether hardware support is being used, and it does so separately for decoding the source video and for encoding to the destination format. That distinction matters too.

While testing hardware transcoding, it is super-important to not run PMS as a Windows system service.

If the Plex server computer is ‘headless’, you will often need to use an HDMI “fake monitor” dongle. Some graphics adapters switch themselves off when no monitor is present, which then disables hardware transcoding as well.

The server is not headless. I don’t run PMS as a service and have no plans to; I’m aware this tends to interfere with access to hardware acceleration functions.

For the 3 test videos:

4800 kb/s (I didn’t realize this one was HEVC; I normally tag those visibly):

<Stream id="323" streamType="1" default="1" codec="hevc" index="0" bitrate="3265" language="English" languageCode="eng" bitDepth="8" chromaSubsampling="4:2:0" colorPrimaries="bt709" colorRange="tv" colorSpace="bt709" colorTrc="bt709" frameRate="23.976" height="800" level="120" profile="main" refFrames="1" width="1920" displayTitle="English (HEVC Main)"/>

10.3 Mb/s:

<Stream id="370" streamType="1" default="1" codec="h264" index="0" bitrate="8790" bitDepth="8" chromaLocation="left" chromaSubsampling="4:2:0" frameRate="23.976" hasScalingMatrix="0" height="800" level="41" profile="high" refFrames="5" scanType="progressive" width="1920" displayTitle="Unknown (H.264 High)"/>

19.9 Mb/s:

<Stream id="373" streamType="1" default="1" codec="h264" index="0" bitrate="18019" language="English" languageCode="eng" bitDepth="8" chromaLocation="left" chromaSubsampling="4:2:0" frameRate="23.976" hasScalingMatrix="0" height="1080" level="41" profile="high" refFrames="4" scanType="progressive" width="1920" displayTitle="English (H.264 High)"/>

What about the ‘Now Playing’?

Did you run the playing web app on the same machine as the server? If so, this could distort your results, as the performance monitor would also show the decoding load from the web player.

Player is a separate physical machine, connected to the same gigabit network. Media files are also served from a separate machine (VM configured as a file server, multiple spindles, SSD caching tier, it’s reasonably fast). I don’t believe there’s enough network activity to saturate the gigabit interfaces.

I’ve dealt with multiple problems with hardware acceleration on PMS in Windows. I’m at a standstill without any solutions. Here are a few things I’ve found in informal testing:

  1. MPEG2 at 720p never uses hardware acceleration, period. It also won’t use the hardware encoder for h264 in this scenario, it’s all CPU.

  2. Also, h264 and h265 decoding will sporadically not use hardware acceleration. The next time you try, it may or may not work. It’s possible you ran into this.

I don’t believe you can see this in the Tautulli logs, so as far as I know, you need to actively watch the Now Playing status to see whether hardware has successfully kicked in.

One additional note: Make sure to give the file at least one minute of playback while watching your CPU load. In my experience, Plex will use more CPU at the start of the video (I assume to build up a cache) and then pull back and use less afterward, depending on the complexity of the video.
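If you log CPU readings instead of eyeballing them, that warm-up period can be thrown away before averaging. A minimal sketch, assuming you already have (seconds since playback start, CPU %) pairs from whatever monitoring tool you use; the function name and the sample readings are made up for illustration.

```python
def steady_state_average(samples, warmup_seconds=60.0):
    """Average CPU % over samples taken after the warm-up window.

    samples: list of (seconds_since_playback_start, cpu_percent) tuples.
    """
    settled = [cpu for t, cpu in samples if t >= warmup_seconds]
    if not settled:
        raise ValueError("no samples after the warm-up window")
    return sum(settled) / len(settled)

# Made-up readings: high at the start, lower once playback has settled.
readings = [(0, 95.0), (15, 90.0), (30, 85.0),
            (60, 40.0), (75, 30.0), (90, 35.0)]
print(steady_state_average(readings))  # 35.0
```

The 60-second default just mirrors the one-minute suggestion above; a longer window may suit files where the transcoder keeps working in bursts.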

Part of the difficulty here, for me, is that I don’t have a solid idea of what it should look like. What I was hoping for was largely idle CPU use, with all the work offloaded into the GPU. I base this mostly on the significant changes I remember when h.264 decoding first got offloaded to dedicated hardware in graphics cards many years ago.

That doesn’t seem to be the case here, although I can’t see enough of what’s happening either. I suspect the heavy CPU use might actually be either the decode operations (although it shouldn’t be, from what I read) or something like a resize and/or smoothing filter applied between the decode and encode operations. If it’s a straight resize, though, I guess I’m surprised it would take that much CPU time.

I’m going to try to do some more testing with ffmpeg, HandBrake, or whatever, and see whether I can get the CPU time down significantly and put more of the work on the GPU.

Yeah, I normally see that higher CPU use at the start as well, and then it spikes every 20 or 30 seconds or so as it processes the next few minutes at a time.

Has anyone sent this link?

https://support.plex.tv/articles/115002178853-using-hardware-accelerated-streaming/

Two-thirds down that page is an image of what the Now Playing status should look like when you’re using hardware acceleration. It should show “(hw)” on both decode and encode.

I’ve found that when hardware acceleration is working, you’ll have some CPU usage (depends on CPU, but pretty low after an initial spike) and in recent versions of Windows 10, you can also see that the GPU load is increasing in Task Manager.

Sorry, I don’t mean literally what it looks like in the UI. I mean metrics. One fairly common topic that seems poorly addressed (as far as I have seen) is how to size the hardware for a given amount of performance, specifically when hardware acceleration is involved. That’s what I was trying to do: put some rough numbers and parts out there that say if you have X you can do Y, so presumably 2X will give you 2Y.
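To make the "X gives Y, so 2X gives 2Y" idea concrete, here is a naive linear extrapolation from one measurement. It is a sketch under a big assumption: that load scales linearly with stream count, which the 1080 results in this thread suggest is not always true. The function name and the 80% budget are my own choices.

```python
import math

def estimate_max_streams(measured_streams, cpu_percent, gpu_percent,
                         budget_percent=80.0):
    """Naive linear estimate of how many streams fit before the CPU or
    the GPU reaches budget_percent utilization, whichever comes first."""
    limits = []
    for used in (cpu_percent, gpu_percent):
        if used > 0:
            # streams at budget = measured streams scaled by budget / usage
            limits.append(budget_percent * measured_streams / used)
    if not limits:
        raise ValueError("need a non-zero CPU or GPU measurement")
    return math.floor(min(limits))

# First test in this thread: 3 streams at 720p, ~30% CPU, GPU up to ~40%.
print(estimate_max_streams(3, 30.0, 40.0))  # 6
```

With those numbers the GPU is the limiting resource (6 streams) while the CPU alone would allow 8, which is exactly the kind of rough sizing figure I was hoping to collect.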

An interesting thing I just noticed. I realized I hadn’t actually checked what Plex thought was happening while playing back, I was just going by inference based on what Windows task manager was showing me.

When I look at “Now Playing” while set to 720p (where the GPU shows activity in Task Manager), Plex isn’t showing me any hardware operations happening. However, at the same time, Task Manager seems to think that the GPU is at least performing the decode operation.

That is, these two images show what Plex says and what Windows says for the same playback:
[image: plex_status]

I’ve migrated my install over to the new hardware now, and tried doing some more testing. The results are super puzzling. While Plex is now telling me more often that it’s doing hardware-assisted work, Task Manager makes it appear that this isn’t always accurate. I’ve tried testing 4 other videos, 2 HEVC and 2 H.264. In each case, if the target resolution is 720p, Task Manager usually agrees on the decode, and sometimes on the encode.

This one looked like successful hw decode and encode @720 and @1080. <Stream id="1056133" streamType="1" default="1" codec="hevc" index="0" bitrate="3097" bitDepth="10" chromaSubsampling="4:2:0" colorRange="tv" frameRate="23.976" height="1080" level="120" profile="main 10" refFrames="1" width="1920" displayTitle="Unknown (HEVC Main 10)"/>

This one always showed sw decode with hw encode in Plex. On the Windows side, @720, the GPU was getting load, but @1080, the GPU looked idle. <Stream id="997371" streamType="1" default="1" codec="h264" index="1" bitrate="12885" bitDepth="10" chromaLocation="left" chromaSubsampling="4:2:0" frameRate="23.976" hasScalingMatrix="0" height="1080" level="41" profile="high 10" refFrames="4" requiredBandwidths="28998,26608,20814,17151,16148,15320,12861,11949" scanType="progressive" width="1920" displayTitle="Unknown (H.264 High 10)"/>

This one showed hw on decode and encode, and appeared to do so @720 and @1080. <Stream id="1057538" streamType="1" default="1" codec="h264" index="0" bitrate="6830" bitDepth="8" chromaLocation="left" chromaSubsampling="4:2:0" colorPrimaries="bt709" colorRange="tv" colorSpace="bt709" colorTrc="bt709" frameRate="23.976" hasScalingMatrix="0" height="1080" level="40" profile="high" refFrames="4" scanType="progressive" width="1920" displayTitle="Unknown (H.264 High)"/>

This one showed hw on decode and encode, and looked like it @720, but again didn’t look like it @1080. CPU use was 80% and GPU looked idle on all the graphs. <Stream id="1056846" streamType="1" default="1" codec="hevc" index="0" bitrate="2283" bitDepth="8" chromaSubsampling="4:2:0" colorRange="tv" frameRate="23.976" height="1080" level="120" profile="main" refFrames="1" width="1920" displayTitle="Unknown (HEVC Main)"/>
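One pattern in these four files may be worth checking: the only one that always fell back to software decode is the 10-bit H.264 file. As far as I know, the fixed-function H.264 decoder on this generation of Intel graphics only handles 8-bit video, but treat that as my assumption rather than something confirmed in this thread. A small sketch that flags such files from trimmed copies of the four Stream elements above (the function name is made up):

```python
import xml.etree.ElementTree as ET

def likely_software_decode(stream_xml):
    """Heuristic: flag H.264 video with more than 8 bits per sample.

    Assumption (not confirmed here): the GPU cannot decode 10-bit H.264,
    so such files fall back to the CPU for decoding.
    """
    stream = ET.fromstring(stream_xml)
    bit_depth = int(stream.get("bitDepth", "8"))
    return stream.get("codec") == "h264" and bit_depth > 8

# Trimmed copies of the four <Stream> elements above.
streams = [
    '<Stream id="1056133" codec="hevc" bitDepth="10" profile="main 10"/>',
    '<Stream id="997371" codec="h264" bitDepth="10" profile="high 10"/>',
    '<Stream id="1057538" codec="h264" bitDepth="8" profile="high"/>',
    '<Stream id="1056846" codec="hevc" bitDepth="8" profile="main"/>',
]

flagged = [ET.fromstring(s).get("id") for s in streams
           if likely_software_decode(s)]
print(flagged)  # ['997371']
```

This says nothing about why the 8-bit HEVC file looked idle on the GPU @1080; it only accounts for the one file that never showed hw decode at all.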

I have a guess at the source of the CPU consumption. Does Plex use a software scaler? That’s the only thing I can figure would be the cause. I’m seeing somewhat similar results testing with ffmpeg from the command line. I’m going to see if ffmpeg will let me use a hardware scaler (I think this is possible with Quick Sync) and, if I can get it to work, whether that causes the CPU utilization to drop.
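For anyone wanting to try the same experiment, this is roughly the kind of ffmpeg invocation I have in mind, built as a Python argument list so it is easy to tweak. It is a sketch, not what Plex does internally: it assumes an ffmpeg build with Quick Sync (QSV) support, the file names are placeholders, and the exact options that work can differ between ffmpeg versions and drivers.

```python
import shlex

def build_qsv_transcode_cmd(src, dst, width=1280, height=720, bitrate="4M"):
    """Build an ffmpeg argument list that tries to keep decode, scale and
    encode on Quick Sync. Requires an ffmpeg build with QSV enabled."""
    return [
        "ffmpeg",
        "-hwaccel", "qsv",      # request Quick Sync hardware decoding
        "-c:v", "h264_qsv",     # QSV H.264 decoder for the input
        "-i", src,
        "-vf", f"scale_qsv=w={width}:h={height}",  # scale on the GPU
        "-c:v", "h264_qsv",     # QSV H.264 encoder for the output
        "-b:v", bitrate,
        "-c:a", "copy",         # leave the audio track untouched
        dst,
    ]

# Placeholder file names, for illustration only.
cmd = build_qsv_transcode_cmd("input.mkv", "output.mkv")
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would run it. Swapping the scale_qsv filter for the plain software scale filter gives a comparison run, which should show whether the scaler is what is eating the CPU.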
