Hardware Accelerated Decode (Nvidia) for Linux

The GTX 1660 is a nice little card. I can get 5+ 4K HEVC transcodes, plus 1 from the CPU (4790K). Quality set to very fast is MILES ahead of Pascal and only slightly behind Quick Sync. It’s a nice middle ground, and seeing how most of my transcode users are on phones, I don’t think they’re going to notice such minor blocking and banding on a screen that size.

I just need tone mapping for HDR --> SDR and I’ll be set!

I thought the quality settings (i.e. very fast) don’t apply to HW transcoding?

The quality settings in Plex do not apply to the hardware transcodes, but there are quality settings that can be tuned in FFMPEG:

root@plex# /usr/lib/plexmediaserver/Plex\ Transcoder2 -h encoder=h264_nvenc

Encoder h264_nvenc [NVIDIA NVENC H.264 encoder]:
    General capabilities: delay
    Threading capabilities: none
    Supported pixel formats: yuv420p nv12 p010le yuv444p yuv444p16le bgr0 rgb0 cuda
h264_nvenc AVOptions:
  -preset            <int>        E..V.... Set the encoding preset (from 0 to 11) (default medium)
     default                      E..V....
     slow                         E..V.... hq 2 passes
     medium                       E..V.... hq 1 pass
     fast                         E..V.... hp 1 pass
     hp                           E..V....
     hq                           E..V....
     bd                           E..V....
     ll                           E..V.... low latency
     llhq                         E..V.... low latency hq
     llhp                         E..V.... low latency hp
     lossless                     E..V....
     losslesshp                   E..V....
  -profile           <int>        E..V.... Set the encoding profile (from 0 to 3) (default main)
     baseline                     E..V....
     main                         E..V....
     high                         E..V....
     high444p                     E..V....
  -level             <int>        E..V.... Set the encoding level restriction (from 0 to 51) (default auto)
     auto                         E..V....
     1                            E..V....
     1.0                          E..V....
     1b                           E..V....
     1.0b                         E..V....
     1.1                          E..V....
     1.2                          E..V....
     1.3                          E..V....
     2                            E..V....
     2.0                          E..V....
     2.1                          E..V....
     2.2                          E..V....
     3                            E..V....
     3.0                          E..V....
     3.1                          E..V....
     3.2                          E..V....
     4                            E..V....
     4.0                          E..V....
     4.1                          E..V....
     4.2                          E..V....
     5                            E..V....
     5.0                          E..V....
     5.1                          E..V....
  -rc                <int>        E..V.... Override the preset rate-control (from -1 to INT_MAX) (default -1)
     constqp                      E..V.... Constant QP mode
     vbr                          E..V.... Variable bitrate mode
     cbr                          E..V.... Constant bitrate mode
     vbr_minqp                    E..V.... Variable bitrate mode with MinQP (deprecated)
     ll_2pass_quality              E..V.... Multi-pass optimized for image quality (deprecated)
     ll_2pass_size                E..V.... Multi-pass optimized for constant frame size (deprecated)
     vbr_2pass                    E..V.... Multi-pass variable bitrate mode (deprecated)
     cbr_ld_hq                    E..V.... Constant bitrate low delay high quality mode
     cbr_hq                       E..V.... Constant bitrate high quality mode
     vbr_hq                       E..V.... Variable bitrate high quality mode
  -rc-lookahead      <int>        E..V.... Number of frames to look ahead for rate-control (from 0 to INT_MAX) (default 0)
  -surfaces          <int>        E..V.... Number of concurrent surfaces (from 0 to 64) (default 0)
  -cbr               <boolean>    E..V.... Use cbr encoding mode (default false)
  -2pass             <boolean>    E..V.... Use 2pass encoding mode (default auto)
  -gpu               <int>        E..V.... Selects which NVENC capable GPU to use. First GPU is 0, second is 1, and so on. (from -2 to INT_MAX) (default any)
     any                          E..V.... Pick the first device available
     list                         E..V.... List the available devices
  -delay             <int>        E..V.... Delay frame output by the given amount of frames (from 0 to INT_MAX) (default INT_MAX)
  -no-scenecut       <boolean>    E..V.... When lookahead is enabled, set this to 1 to disable adaptive I-frame insertion at scene cuts (default false)
  -forced-idr        <boolean>    E..V.... If forcing keyframes, force them as IDR frames. (default false)
  -b_adapt           <boolean>    E..V.... When lookahead is enabled, set this to 0 to disable adaptive B-frame decision (default true)
  -spatial-aq        <boolean>    E..V.... set to 1 to enable Spatial AQ (default false)
  -temporal-aq       <boolean>    E..V.... set to 1 to enable Temporal AQ (default false)
  -zerolatency       <boolean>    E..V.... Set 1 to indicate zero latency operation (no reordering delay) (default false)
  -nonref_p          <boolean>    E..V.... Set this to 1 to enable automatic insertion of non-reference P-frames (default false)
  -strict_gop        <boolean>    E..V.... Set 1 to minimize GOP-to-GOP rate fluctuations (default false)
  -aq-strength       <int>        E..V.... When Spatial AQ is enabled, this field is used to specify AQ strength. AQ strength scale is from 1 (low) - 15 (aggressive) (from 1 to 15) (default 8)
  -cq                <float>      E..V.... Set target quality level (0 to 51, 0 means automatic) for constant quality mode in VBR rate control (from 0 to 51) (default 0)
  -aud               <boolean>    E..V.... Use access unit delimiters (default false)
  -bluray-compat     <boolean>    E..V.... Bluray compatibility workarounds (default false)
  -init_qpP          <int>        E..V.... Initial QP value for P frame (from -1 to 51) (default -1)
  -init_qpB          <int>        E..V.... Initial QP value for B frame (from -1 to 51) (default -1)
  -init_qpI          <int>        E..V.... Initial QP value for I frame (from -1 to 51) (default -1)
  -qp                <int>        E..V.... Constant quantization parameter rate control method (from -1 to 51) (default -1)
  -weighted_pred     <int>        E..V.... Set 1 to enable weighted prediction (from 0 to 1) (default 0)
  -coder             <int>        E..V.... Coder type (from -1 to 2) (default default)
     default                      E..V....
     auto                         E..V....
     cabac                        E..V....
     cavlc                        E..V....
     ac                           E..V....
     vlc                          E..V....
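
To show how a few of the options listed above fit together, here is a minimal sketch that assembles an h264_nvenc option string for a constant-quality VBR encode. The helper name `nvenc_opts` and its defaults are hypothetical; only the option names and value ranges come from the help output above. Plex builds its own transcoder command line, so this is meant for experimenting with the transcoder binary by hand.

```sh
#!/bin/sh
# Hypothetical helper: print h264_nvenc options for a constant-quality VBR encode.
# Option names and ranges are taken from the help output above; the defaults
# (slow, 23, 8) are illustrative assumptions, not recommendations.
nvenc_opts() {
    preset=${1:-slow}
    cq=${2:-23}
    aq=${3:-8}

    # The help text gives 0..51 for -cq and 1..15 for -aq-strength.
    if [ "$cq" -lt 0 ] || [ "$cq" -gt 51 ]; then
        echo "cq must be between 0 and 51" >&2
        return 1
    fi
    if [ "$aq" -lt 1 ] || [ "$aq" -gt 15 ]; then
        echo "aq-strength must be between 1 and 15" >&2
        return 1
    fi

    echo "-c:v h264_nvenc -preset $preset -rc vbr_hq -cq $cq -spatial-aq 1 -aq-strength $aq"
}

nvenc_opts slow 23 8
```

The printed string would go between the input and output arguments of a manual transcoder call. Whether a given combination actually helps is something to test per card.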

I’ve been using the transcoder calls to create various test cases to figure out the best performance for 4K media.

One big takeaway I’ve found is that burning in subtitles is a big problem, and was the common factor as to why my transcodes were running at 0.5x speeds.

A standard 4K HEVC 10-bit source will transcode just fine down to 1080p so long as subtitles are disabled. The 7.1 TrueHD audio and 5.1 AC3 audio were not a bottleneck in my testing either.

The quality settings in Plex do not apply to the hardware transcodes, but there are quality settings that can be tuned in FFMPEG

Oh wow, I did not know you could manually tune that. That would have to be done after each Plex update though, correct?

On a side note, I’m wondering if it’s worth trading in my GTX 1080 Ti for something like a Quadro RTX 4000.

What exactly is the point of tuning quality for something that is not going to be saved?

Techno, it’s more about understanding the capabilities and what can be done with them. When having Plex do optimizations, these options can become more important, but agreed, they are less important with live transcodes.

It would be nice to see 10-bit color support added for H.264, for devices that support HDR but are not on the local LAN. The FFmpeg build currently shipped with Plex does not appear to support those color depths. Along the same lines, H.265 support would be useful to improve the quality of remote streams for devices that support it, but the hevc_nvenc encoder is not enabled in the build Plex uses.

I don’t think Nvidia has announced a Quadro RTX card that has two encoder chips like the 1080 Ti, so that will be a key consideration.

It directly impacts the quality of the viewer experience; what else could matter? It is not like we are keeping archival copies of this material for historical purposes here :) All that matters is the quality of the user experience.

If anyone is worried about viewer experience, then they/you should not even be transcoding in the first place.

I understand the main solution everyone is clamoring for is a single HQ copy of the content that can be scaled down to whatever the client/connection can handle.

But even when NVDEC gets released to the masses, there are still issues with subtitles and, of course, HDR > SDR.

It all comes back to this: for best results, your content should be appropriate to your clients.

This has been true for years. Plex even has a built-in function for this; it’s called optimized files.

Everyone just wants to throw more hardware at an issue that is psychological.

User expectations vary so much! I share my Plex with some who don’t even bother to change their client configuration from the default 720p 2/4 Mbps rate. Then you have the folks who will take screen caps, magnify and compare, and use that to claim “unacceptable quality differences.” My goals are in the middle. For any transcoding situation, I am 100% satisfied with what FFMPEG (using software, NVENC, or iGPU) will provide when working correctly, and for folks who want better than that, it’s up to them to make sure they have clients that will stream directly. The push from SW to HW is just about enabling more simultaneous higher-quality streams.

Is there anyone here with a similar setup, with a decently powerful CPU together with an Nvidia GPU? I’m quite happy with my current solution, where I only let the GPU do the HEVC decode (mainly to save memory space). By design, when the checkmark for “HW transcoding” is enabled, all encodes are done by the GPU.

Now, once Plex inevitably enables HW decode on Nvidia GPUs natively, can I still use a bash script to force H264 transcodes onto the CPU and HEVC onto the GPU? I’m not sure what “flags” Plex uses for its encoder to forward something to the GPU and/or CPU.

I would like to know if this will be possible after the release.

That would be possible if using a wrapper script to edit the attributes passed to the transcoder. Very similar to what is being done now to add the support, the same thing can be done to remove the nvdec enablement.

This is something I’m intending to work into the patching script so users can tune their options of what to leverage nvdec for versus not.
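
A rough sketch of what that per-codec decision could look like inside a wrapper script. Two things here are assumptions rather than anything confirmed in this thread: that the source video codec appears in the argument list as `-codec:0 <name>`, and that hardware decode is switched on by prepending `-hwaccel nvdec`. The function names are made up for the example, and it only covers the decode side.

```sh
#!/bin/sh
# Sketch of a per-codec decision inside a transcoder wrapper script.
# Assumptions (not confirmed by this thread): the source video codec shows up
# in the argument list as "-codec:0 <name>", and hardware decode is enabled by
# prepending "-hwaccel nvdec". Check the real arguments on your own server.
want_nvdec() {
    prev=""
    for arg in "$@"; do
        if [ "$prev" = "-codec:0" ]; then
            # Only HEVC sources go to the GPU decoder; everything else stays on the CPU.
            [ "$arg" = "hevc" ] && return 0
            return 1
        fi
        prev=$arg
    done
    # No codec argument found: leave hardware decode off.
    return 1
}

# Print the extra flags the wrapper would prepend for a given argument list.
extra_flags() {
    if want_nvdec "$@"; then
        echo "-hwaccel nvdec"
    else
        echo ""
    fi
}

extra_flags -codec:0 hevc -i movie.mkv
extra_flags -codec:0 h264 -i show.mkv
```

In an actual wrapper, the last step would be to hand off to the renamed transcoder binary with those flags followed by the original, untouched arguments.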

Sounds very interesting. That way we could load-balance ourselves. Something I’m planning on adding is time-dependent (or even load-dependent) transcoder settings. This way I can use the limited (but energy-efficient) resources of the GPU during low-load scenarios, but let the CPU kick in during busy times, when more resources are required. All these are nice advanced options that only apply to a few of us. Still nice to have :)

If you mean load by number of streams, you can already do that by not using the patch if you’re using a card without unlimited streams. It would use your GPU for the first 2 streams, then the CPU after that.

Unlimited is not really unlimited though. You are still limited by memory usage or encoder/decoder speeds. So I meant load balancing where I put, say, 8 HEVC streams on the GPU, where it will be close to max usage, so the rest will go to the CPU.
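
The stream-count idea above could be sketched roughly like this. The function name is hypothetical, the default limit of 8 is just the number from the post, and how the current GPU stream count is obtained is deliberately left out.

```sh
#!/bin/sh
# Sketch: choose where the next stream goes based on how many streams the GPU
# already handles. The default limit of 8 comes from the example in the post
# above; obtaining the current stream count is left to the caller.
pick_decoder() {
    gpu_streams=$1
    gpu_limit=${2:-8}
    if [ "$gpu_streams" -lt "$gpu_limit" ]; then
        echo "gpu"
    else
        echo "cpu"
    fi
}

pick_decoder 3
pick_decoder 8
```

A wrapper could call something like this before deciding whether to enable hardware decode for the new stream.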

So a year later. I finally read through all of y’all’s comments. Great conversations. But now the real question no one has asked in a while. Does PMS on Linux support HW decoding out of the box without doing a fancy patch? I’d rather not mess with my system more than I need to. Wife got mad at me the last time I took our server down tinkering.

Intel Graphics: Yes. Nvidia: No.

If you truly read the entire thread and don’t understand, this isn’t the right thread for you and you should wait.

Not out of the box yet, but it can be done with a minor fix. See this thread: Guide: NVDEC Hardware Acceleration Patch for Plex Media Server on Linux

@ChuckPA Given there have been hacky ways found to enable HW decoding, it’s obviously possible currently, but are you aware of, and able to tell us, a reason it wasn’t part of the update that made it possible to enable with our wrapper script? I realize you have no roadmaps or anything solid right now; I’m just curious as to the reason. I assume it’s QA processes and whatnot, but I figure why not ask. If not, no worries, man. Thanks!