HW transcoding isn’t working smoothly with Nvidia GPU

Having the same problem here.
Hardware: HP ProLiant DL380p 25SFF, 96 GB DDR3-1866 RAM, 2 x Xeon E5-2650 v2 (ten-core)
Video: HP Quadro P2000

Plex server is running in the tray (not as a service) in a Windows 10 VM with the Quadro P2000 passed through. The VM has 8 cores, 8 GB RAM and a 2 GB ramdisk for transcoding, and gets roughly the same scores in 3D benchmark apps like Unigine Superposition as a bare metal install.

Problem:
Transcoding HEVC 10-bit HDR (cannot find any non-HDR file to test) to anything (like 1080p h264) results in ~20-30% CPU usage and about 18-20% GPU usage (both Encode and Decode are showing up in Task Manager), but the stream buffers every half a minute or so.
Tried with different audio tracks (AC3 5.1, Atmos7.1) no difference.
The card has a displayport dummy plug in it.
Tried with and without subtitles.

UPDATE:
Behaves the same in Windows 7.

I have bought Plex Pass only for hw transcoding and it is not working.

Isn’t the buffer there to hold data that is waiting to be sent?
So if the server is buffering video, doesn’t that imply that it’s processing faster than the player can take it?
If there is activity in Task Manager showing video decode with Plex Transcoder… wouldn’t that mean HW is working?
What does it show on the dashboard?
I gave up on GPU transcoding 10 years ago. It didn’t look good to me then and really hasn’t improved that much today compared to CPU transcoding, and you have more than enough under the hood for h264 on the fly.

The PLAYER is buffering, as in freezing for a few seconds every half a minute.
So the transcoding is not keeping up with real time; it runs below it.
H265 is the problem. I don’t want to assign 32 cores out of a total of 40 for Plex. Not to mention the power consumption.

If this is not solved by the end of the month I’m cancelling my Plex Pass sub.

OK, let’s step back and think about this for a moment…
Transcoding was explained to me many years ago as a two-step process.

First step: decode the video. I picked up a GT 1030 specifically for this Plex server; it’s a 30 W card that can decode HEVC, on sale for 70 bucks.

Second step: encode the decoded video to a different format. The CPU gets to handle that.

I was still having a problem, just like what you describe: it would send to the TV and fall flat on its face. Buffering, buffering, “Your connection is too slow”, etc.

My fix for that was an Archer C7 router, on sale for 60 bucks.


The gigabit connection to the router and the dual band made all the difference for me.
The system is a 1700X with Win10 Pro running off an SSD. I’m not using a RAM drive, just plain old enterprise 3 TB hard drives: 2 for movies, 1 for TV shows and 1 for music.
Take a close look at the picture I posted: I was serving up the same 4k HEVC movie to 2 different TVs at the same time, and it never missed a beat.
I couldn’t be happier with this system as far as its low power use and its ability to serve up media. My only complaint is the TV guide, but I’ve managed to install a different one and have it working now.

My recommendation to you is: get on the server, fire up a player, run a 4k movie and see what it does. If it falls flat, I would think it’s a decode issue; if it plays fine, look at your router setup.

Good luck I hope this helps

Thanks for trying to help mate.
I really don’t have any infrastructure problem.
My TVs are for example wired to the network and anyway I was testing Plex client in browser on my workstation, also wired.
YouTube 4K@60fps works fine on my VM, with very little gpu and cpu usage (about 20% each)
The storage is made up of 24 SATA disks of 1 TB in RAID60 and a 512 GB cache SSD. My throughput is above 1 GB/s most of the time.
I don’t have any performance issues other than Plex. And as I was saying, Plex is not using the resources. The CPU and GPU usage are too low to keep the clients well fed with data.

Regards.

Emby works perfectly with 4k remuxes on the P2000 but Plex doesn’t. It’s not a hardware issue but rather a Plex issue.

I did a test on linux.
I enabled hardware DECODING by renaming the transcoder to Plex Transcoder.bin and making a script in place of the original binary like this

#!/bin/bash
exec /usr/lib/plexmediaserver/Plex\ Transcoder.bin -hwaccel nvdec "$@"

Then I disabled hardware acceleration in UI, so that plex would only decode in hardware.
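
For anyone who wants to repeat the rename-and-wrap step described above, here is a minimal sketch of it as a single function. The function name install_nvdec_wrapper is made up for illustration, and the default path is simply the one from the post; neither is anything Plex provides. It assumes the real transcoder is a binary (so it does not start with "#!") and that Plex Media Server is stopped while you run it as root.

```shell
#!/bin/bash
# Sketch only: wrap "Plex Transcoder" so every invocation gets -hwaccel nvdec.
# install_nvdec_wrapper is a hypothetical helper name, not part of Plex.
install_nvdec_wrapper() {
    local dir="${1:-/usr/lib/plexmediaserver}"   # install path taken from the post above
    local bin="$dir/Plex Transcoder"
    local real="$dir/Plex Transcoder.bin"

    [ -f "$bin" ] || { echo "not found: $bin" >&2; return 1; }

    # If the file is not already a shell wrapper, treat it as the real
    # binary and move it aside (this also covers a reinstall after an upgrade).
    if [ "$(head -c 2 -- "$bin")" != '#!' ]; then
        mv -f -- "$bin" "$real" || return 1
    fi

    # Write the wrapper: prepend the flag and pass all original arguments through.
    cat > "$bin" <<EOF
#!/bin/bash
exec "$real" -hwaccel nvdec "\$@"
EOF
    chmod 755 "$bin"
}

# Usage (as root, with Plex Media Server stopped):
#   install_nvdec_wrapper /usr/lib/plexmediaserver
```

A package upgrade will most likely put a fresh binary back in place of the wrapper, so the function would have to be run again afterwards.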

The result mystifies me as this works on any file I throw at it even though the cpu is busy encoding.

So by using hw transcoding everything buffers and stutters (not to mention transcoding from 4k to 720 or 1080p would simply not work at all) but having only decoding part hardware accelerated everything works perfectly fine (apart from using a LOT of cpu for the encoding part)

This smells a bit like a bad implementation of encoding. Maybe something changed in the drivers and Plex is not using it right ?!

I also tested with older nvidia drivers as far back as 390.xx (with patches for my 4.20.x kernel) no dice.

Here’s something interesting though.
When I enable hw transcoding in the UI, the overall CPU usage is quite low BUT one thread is stuck at the ceiling, at 90ish percent all the time. (see the included screenshots)

When I disable it (but leave hw DECODING in place with my above script) the overall CPU usage is much higher but no single thread is pegged at the ceiling. And everything runs smoothly.

So this looks like a case of a badly implemented single producer thread on which the whole hardware encoding depends.
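
To check the "one thread pegged" observation without screenshots, the per-thread numbers can be pulled with ps. This is a rough sketch: busiest_thread is a made-up helper name, and it only parses the three-column output of `ps -L -o tid,pcpu,comm`. Keep in mind that ps reports CPU averaged over the thread's lifetime, so `top -H -p <pid>` is the better live view.

```shell
#!/bin/bash
# busiest_thread (hypothetical helper): read "TID %CPU COMMAND" rows on stdin,
# skip the header line, and print the TID and %CPU of the busiest thread.
busiest_thread() {
    awk 'NR > 1 && ($2 + 0) > max { max = $2 + 0; tid = $1 }
         END { if (tid != "") printf "%s %.1f\n", tid, max }'
}

# Usage on the server while a transcode is running:
#   pid=$(pgrep -f 'Plex Transcoder' | head -n 1)
#   ps -L -o tid,pcpu,comm -p "$pid" | busiest_thread
```

The TID it prints is the number to hand to `strace -p` when looking at what that thread is doing.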

EDIT:

strace-ing various Transcoder threads, only one stands out with multiple timeouts:

[root@plex-hw ~]# strace -p 4050
strace: Process 4050 attached
restart_syscall(<... resuming interrupted futex ...>) = -1 ETIMEDOUT (Connection timed out)
futex(0x132b8b8, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC_RAW, {tv_sec=5161, tv_nsec=992372362}) = 0
clock_gettime(CLOCK_MONOTONIC_RAW, {tv_sec=5161, tv_nsec=992549062}) = 0
futex(0x7f25ec000b48, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1552294259, tv_nsec=558874000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x132b8b8, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC_RAW, {tv_sec=5162, tv_nsec=493315950}) = 0
clock_gettime(CLOCK_MONOTONIC_RAW, {tv_sec=5162, tv_nsec=493494895}) = 0
futex(0x7f25ec000b48, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1552294260, tv_nsec=59825000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x132b8b8, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC_RAW, {tv_sec=5162, tv_nsec=994079007}) = 0
clock_gettime(CLOCK_MONOTONIC_RAW, {tv_sec=5162, tv_nsec=994199596}) = 0
futex(0x7f25ec000b48, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1552294260, tv_nsec=560481000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
futex(0x132b8b8, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC_RAW, {tv_sec=5163, tv_nsec=494697849}) = 0
clock_gettime(CLOCK_MONOTONIC_RAW, {tv_sec=5163, tv_nsec=494811847}) = 0
futex(0x7f25ec000b48, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1552294261, tv_nsec=61110000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)

I don’t know if it means anything though.
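
One way to read the trace above is to measure how far apart the futex deadlines are. The sketch below (futex_timeout_deltas is a made-up helper name) pulls the absolute deadline out of each timed-out FUTEX_WAIT_BITSET call and prints the gap to the previous one in milliseconds. On the excerpt above the gaps come out at about 500 ms, which would be consistent with a thread that sleeps on a half-second timed wait rather than one that is busy. That is only an interpretation, not a diagnosis.

```shell
#!/bin/bash
# futex_timeout_deltas (hypothetical helper): read strace output on stdin and
# print the time between consecutive timed-out FUTEX_WAIT_BITSET deadlines.
futex_timeout_deltas() {
    grep 'FUTEX_WAIT_BITSET' | grep 'ETIMEDOUT' |
    # Keep only the deadline: "tv_sec tv_nsec"
    sed -E 's/.*tv_sec=([0-9]+), tv_nsec=([0-9]+).*/\1 \2/' |
    # Seconds and nanoseconds are subtracted separately to avoid float rounding.
    awk 'NR > 1 { printf "%.0f ms\n", ($1 - s) * 1000 + ($2 - n) / 1e6 }
         { s = $1; n = $2 }'
}

# Usage: capture a few seconds of one thread to a file, then summarise it.
#   strace -p 4050 -o trace.txt     (stop with Ctrl-C)
#   futex_timeout_deltas < trace.txt
```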

PLEX DEVS! HELP!
HW TRANSCODING DISABLED IN UI:

[screenshot not preserved]

HW TRANSCODING ENABLED IN UI:

[screenshot not preserved]

OK. If nobody answers on this, or at least lets us know that someone is working on it, I’ll just cancel my Plex Pass and move to Emby.
Nobody can say I did not try working this out. Enough with losing time over this.

@dlbogdan I started investigating this today, but have nothing to report yet. It’s not clear if this is a new issue or one that started with PMS 1.15.0 and newer. I see users above reporting on Windows 7 vs Windows 10 performance, and also your screenshots of Linux top. I myself have a P2000 GPU, for which only HW encode is officially supported on Linux. I am aware of a script that forces HW decode, which some users report success with on PMS 1.15.0, yet I know that while it may appear to be working, the ffmpeg version our transcoder is currently based on does not include all the recent Nvidia decode improvements.

We are working on an update to our transcoder based on a more recent FFmpeg release, which will likely be a forum preview for users to test out when it’s ready.

In the meantime, can you provide any info on when the behaviour you’re seeing regressed, if it is a regression?

@chrisallen
Unfortunately I am not in a position to say whether this is a regression, because I only recently bought both the P2000 and the Plex Pass.
Also, I’m not able to officially download any older version of Plex to test whether this is indeed a regression.
If you can provide me with one, I’d be happy to test. Someone here mentioned that it last worked perfectly fine sometime in August 2018.
I’m glad someone is actually working on this.
All that I’m certain of is that, as you can see above, only the encoder part of the hardware transcoder has problems. The decoder, even though not officially supported, works perfectly fine with the files I’ve tested (mainly 4k HEVC HDR and 1080p H264 files).
It is also worth mentioning that all-software transcoding has always worked fine (but with 8-10 cores fully utilized).

I installed the release stream version plexmediaserver-1.14.1.5488 and I can confirm the behaviour is the same.

@chrisallen Don’t know if this is related or not, but:
When transcoding HEVC 4K with hw encoding enabled and I change the resolution to anything else but the automatic one (Maximum) the transcoding automatically switches to software.
I get this in Plex logs:

Failed to initialise VAAPI connection: -1 (unknown libva error).
Mar 12, 2019 09:44:03.732 [0x7f9bb6ea9700] DEBUG - Codecs: hardware transcoding: opening hw device failed - probably not supported by this system, error: Input/output error

Transcoding h264 works perfectly fine in any resolution HW or SW.

@chrisallen
I can confirm that the hardware accelerated encoder works correctly on
plexmediaserver-1.13.5.5291-6fa5e50a8.x86_64.rpm
with nvidia driver
NVIDIA-Linux-x86_64-390.116

The decoder hack with pushing -hwaccel nvdec will not work though as this ffmpeg is compiled without nvdec.

I’ll try these versions on windows to see if I can get full acceleration going.

Ok. no go with Windows.
Reverted back to linux, latest nvidia drivers, latest plex to do some more debugging.

The problem is the encoder, as I was saying; the decoder works fine even though it’s still not officially supported (ironic).

I can more precisely say now what’s going on though.

I was first under the impression that this happens only on 4k files but it happens on any source, h264 or h265, 4k or 1080p. The only difference is that on 1080p source files, the one core that is maxed out is not at 96+ % but 70ish so it kinda works ok.

So this is what happens for example with a 4k source file:
The player is Chrome and by default the server automatically transcodes to 4k h264, which works, but with one core on the server maxed out and the others mostly idling.
If I choose a different quality, for example 1080p 12Mbit, or 720… any quality, the transcoding stops entirely with this error in the log:

Mar 14, 2019 21:08:35.521 [0x7f2344ff9700] DEBUG - MDE: Selected protocol dash; container: mp4
Mar 14, 2019 21:08:35.521 [0x7f2344ff9700] DEBUG - MDE: analyzing media item 10
Mar 14, 2019 21:08:35.522 [0x7f2344ff9700] DEBUG - MDE: Life of Pi (2012): Direct Play is disabled
Mar 14, 2019 21:08:35.522 [0x7f2344ff9700] DEBUG - MDE: Life of Pi (2012): media must be transcoded in order to use the dash protocol
Mar 14, 2019 21:08:35.522 [0x7f2344ff9700] DEBUG - MDE: Life of Pi (2012): no direct play video profile exists for http/mkv/hevc
Mar 14, 2019 21:08:35.522 [0x7f2344ff9700] DEBUG - MDE: Life of Pi (2012): no direct play video profile exists for http/mkv/hevc/dca
Mar 14, 2019 21:08:35.522 [0x7f2344ff9700] DEBUG - Life of Pi - video.bitDepth limitation applies: 10 > 8
Mar 14, 2019 21:08:35.522 [0x7f2344ff9700] DEBUG - Life of Pi - audio.channels limitation applies: 6 > 2
Mar 14, 2019 21:08:35.522 [0x7f2344ff9700] DEBUG - MDE: Life of Pi (2012): no remuxable profile found, so video stream will be transcoded
Mar 14, 2019 21:08:35.522 [0x7f2344ff9700] DEBUG - Codecs: testing h264_nvenc (encoder)
Mar 14, 2019 21:08:35.628 [0x7f2308ea5700] DEBUG - [TranscodeOutputStream] Input processing thread exited after writing 692 bytes, m_closed=1, m_endOfFileReached=0, session->isStopped()=1
Mar 14, 2019 21:08:35.628 [0x7f23467fc700] DEBUG - Cleaning directory for session htcw0w1e2mq01gdd3g0oqrx3 (/ramdisk/Transcode/Sessions/plex-transcode-htcw0w1e2mq01gdd3g0oqrx3-bc3edaa9-3661-4dce-97c0-67952bd9ee64)
Mar 14, 2019 21:08:35.628 [0x7f23b0ad4700] DEBUG - Completed: [10.10.0.194:64315] -2 GET /video/:/transcode/universal/subtitles?hasMDE=1&path=%2Flibrary%2Fmetadata%2F10&mediaIndex=0&partIndex=0&protocol=dash&fastSeek=1&directPlay=0&directStream=1&subtitleSize=100&audioBoost=100&location=lan&addDebugOverlay=0&autoAdjustQuality=0&directStreamAudio=1&mediaBufferSize=102400&session=htcw0w1e2mq01gdd3g0oqrx3&subtitles=auto&Accept-Language=ro (17 live) GZIP 64249ms 704 bytes (pipelined: 16)
Mar 14, 2019 21:08:36.117 [0x7f2344ff9700] DEBUG - MDE: Cannot direct stream video stream due to profile or setting limitations
Mar 14, 2019 21:08:36.117 [0x7f2344ff9700] DEBUG - Codecs: hardware transcoding: testing API vaapi
Mar 14, 2019 21:08:36.117 [0x7f2344ff9700] ERROR - [FFMPEG] - libva: va_getDriverName() failed with unknown libva error,driver_name=(null)
Mar 14, 2019 21:08:36.117 [0x7f2344ff9700] ERROR - [FFMPEG] - Failed to initialise VAAPI connection: -1 (unknown libva error).
Mar 14, 2019 21:08:36.117 [0x7f2344ff9700] DEBUG - Codecs: hardware transcoding: opening hw device failed - probably not supported by this system, error: Input/output error
Mar 14, 2019 21:08:36.118 [0x7f2344ff9700] DEBUG - Scaled up video bitrate to 273802Kbps based on 4.500000x fudge factor.
Mar 14, 2019 21:08:36.118 [0x7f2344ff9700] DEBUG - Scaled maximum bitrate for resolution reduction to 96168Kbps.
Mar 14, 2019 21:08:36.118 [0x7f2344ff9700] DEBUG - Life of Pi - audio.channels limitation applies: 6 > 2
Mar 14, 2019 21:08:36.118 [0x7f2344ff9700] DEBUG - MDE: Cannot direct stream audio stream due to profile or setting limitations
Mar 14, 2019 21:08:36.118 [0x7f2344ff9700] DEBUG - MDE: Life of Pi (2012): selected media 0 / 10
Mar 14, 2019 21:08:36.118 [0x7f2344ff9700] DEBUG - Streaming Resource: Reached Decision id=10 codes=(General=1001,Direct play not available; Conversion OK. Direct Play=3000,App cannot direct play this item. Direct play is disabled. Transcode=1001,Direct play not available; Conversion OK.) media=(id=10 part=(id=10 decision=transcode container=mp4 protocol=dash streams=(Video=(id=493 decision=transcode bitrate=9357 encoder=h264_nvenc width=2276 height=1280) Audio=(id=494 decision=transcode bitrate=101 encoder=aac channels=2 rate=48000) Subtitle=(id=497 decision=transcode bitrate=2147483647 encoder=ass languageCode=rum location=sidecar))))
Mar 14, 2019 21:08:36.118 [0x7f23ab7fe700] DEBUG - Killing job.
Mar 14, 2019 21:08:36.118 [0x7f23ab7fe700] DEBUG - Signalling job ID 1857 with 9
Mar 14, 2019 21:08:36.119 [0x7f23ab7fe700] DEBUG - Job was already killed, not killing again.
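
Most of a log like the one above is noise for this problem. A small filter along these lines keeps only the codec probing, the ffmpeg messages and the job kill, and trims the date and thread id so the sequence is easier to follow. hw_log_summary is a made-up helper name, and the log path in the usage line is an assumption about a default Linux package install, so adjust it to wherever your logs actually live.

```shell
#!/bin/bash
# hw_log_summary (hypothetical helper): print only the lines of a Plex Media
# Server log that concern hardware codec selection, ffmpeg errors and job kills.
hw_log_summary() {
    grep -E 'Codecs: (testing|hardware transcoding)|\[FFMPEG\]|Killing job' "$1" |
    # Drop the "Mar 14, 2019" date and the "[0x...]" thread id, keep the time.
    sed -E 's/^[A-Z][a-z]{2} [0-9]{2}, [0-9]{4} ([0-9:.]+) \[0x[0-9a-f]+\] /\1 /'
}

# Usage (path is an assumption, not confirmed anywhere in this thread):
#   hw_log_summary "/var/lib/plexmediaserver/Library/Application Support/Plex Media Server/Logs/Plex Media Server.log"
```

Run against the excerpt above, it would leave the h264_nvenc test, the VAAPI probe with its two libva errors, the "opening hw device failed" line and the job kill.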

If I disable the hardware accel in ui and leave the hack in place everything works as it should on any source any resolution, but when using hevc 4k files it’s eating up about 8-10 cores fully on only ONE stream (which is expected for software decoding of 4k h265).

Could you provide logs that cover you having HACK + HW Transcoding Enabled, and HACK + HW Transcoding Disabled, for the same file?

We are working on an updated “Plex Transcoder” that may improve the behaviour you are seeing, but we won’t know till we have a forum preview available in the near future.

I’ve installed a new vm, with ubuntu 18.04 this time to test things out.
I can’t tell you why, but at first glance it seems that it no longer crashes when choosing a different resolution. (Did you release the update?)
There is still the problem of only one CPU thread being used while transcoding in hardware (both encoding and decoding), but it seems that it no longer buffers when playing one 4k movie.
I’ll try my old VM with Fedora 29 tomorrow, see if there’s a new beta Plex build, and pull some logs if it crashes.
I haven’t been paying attention to the latest beta versions, but I’m guessing the build I’m testing on is a new one with some of your work in it: 1.15.2.793-782228f99

We are working hard on our updated Plex Transcoder which we hope to have as a forum preview. I’ll be sure to let you know as soon as we have something that you can test.

Thank you @chrisallen
This sure sounds comforting.
I’ve made a few logs on 3 different OSes like you asked.
Same file for all logs. (4K HEVC)
All logs have been cleared on start of each test.

Each test begins with transcoding the file to the client (Chrome) in automatic mode for about 30 seconds, then changing the resolution to 720p HD (2 Mbps) for another ~30 seconds.

Results from my point of view:
On windows it fails to change resolution with HW Transcoding ON, the player image stays black forever.
On fedora 29 it has the same behaviour as on windows but with the added benefit that you can enable decoding separately with the hack which works fine.
On Ubuntu 18.04 the transcoding works fine with HW ON and the HACK even when you change resolution.

All OSes have another thing in common: when fully hardware transcoding (dec+enc), one core goes to 100% usage and the others idle.

Plex Debugging.zip (1.2 MB)

Thank you, I’m waiting for an update too. I can’t help by providing data the way dlbogdan does, since I’m new to hardware transcoding and Plex, sorry.

Hey @chrisallen,
Have you got my logs?
Should I open a new thread about this? This problem doesn’t apply only to this version of PMS.