Convert incompatible (text based) subtitles instead of transcoding
I want to avoid transcoding as much as possible, so if a text-based subtitle format (in my case WebVTT) isn't supported by the client, the server should make an effort to convert it to another, supported text-based subtitle format. Generally, every client should understand the basic SRT format.
And I know that the advanced subtitle formats (ASS/WebVTT) don't map perfectly onto SRT, so the converter would have to make some compromises, e.g. dropping most of the styling, but to me this would still be preferable to transcoding.
And if the client supports at least one of the advanced subtitle formats, you could probably emulate any format the client doesn't support.
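To illustrate the kind of compromise converter I have in mind, here is a minimal sketch (hypothetical, not anything the server actually does): WebVTT cues map onto SRT by renumbering the cues, swapping the decimal separator in the timestamps, discarding cue settings, and stripping inline styling tags.

```python
import re

# Matches "(hh:)mm:ss.mmm --> (hh:)mm:ss.mmm"; the hour part is optional in WebVTT
TIMESTAMP = re.compile(
    r"(\d{2}:)?(\d{2}):(\d{2})\.(\d{3})\s*-->\s*(\d{2}:)?(\d{2}):(\d{2})\.(\d{3})"
)

def _fmt(h, mn, s, ms):
    # SRT always has hours and uses "," as the millisecond separator
    return f"{(h or '00:').rstrip(':')}:{mn}:{s},{ms}"

def vtt_to_srt(vtt: str) -> str:
    """Convert a simple WebVTT document to SRT, dropping all styling."""
    lines = vtt.splitlines()
    out, index, i = [], 0, 0
    # Skip the WEBVTT header block (everything up to the first blank line)
    while i < len(lines) and lines[i].strip():
        i += 1
    while i < len(lines):
        m = TIMESTAMP.search(lines[i])
        if m:
            index += 1
            out.append(str(index))
            out.append(f"{_fmt(*m.groups()[0:4])} --> {_fmt(*m.groups()[4:8])}")
            i += 1
            # Copy the cue payload, stripping inline tags like <b> or <c.class>
            while i < len(lines) and lines[i].strip():
                out.append(re.sub(r"<[^>]+>", "", lines[i]))
                i += 1
            out.append("")
        else:
            i += 1
    return "\n".join(out)
```

A real converter would also need to handle cue settings, voice spans, and overlapping cues, but even this crude version produces something every client can display.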
Ideally, this should be a server setting though. I know that some clients, like Amazon's Fire TV, have a subtitle transcode setting that lets you choose between "Automatic" and "Graphics subtitles only" transcoding, but my client doesn't offer such a setting. Furthermore, I completely disabled video transcoding on my Raspberry Pi server, so it's either no subtitles or no movie for my users.
Also, notice how I put "text based" in parentheses in the title? Why stop there? Why not include OCR in the server and try to convert graphics-based subtitles to text as well? You could even be smart about it and run the result through a dictionary based on the specified subtitle language.
Again, it's probably best to put that behind a setting too, but judging by the number of subtitle-related threads, many users would appreciate something like this.
How do I get a 4K HDR → 1080p SDR transcode, with the PGS subtitle burned in, to play without the video buffering for 2-4s every 15-30s? Transcode speed never reliably gets above 0.8.
I'm on Ubuntu with a Ryzen 5800X and an RTX 5000.
What processor will handle this?
" Image subtitle burning is limited to a single thread so it doesn’t matter what your overall CPU usage is but rather what a single core’s usage is (as well as memory bandwidth and bandwidth between the GPU). Your bottleneck is likely a combination of the above. So when you have 20% CPU usage, you likely have a single core that’s pegged at 100% doing this process.
To summarize, for each frame, between the GPU decoding it and re-encoding it:
1. The hardware decodes in a process parallel to the CPU.
2. The CPU copies the decoded image frame out of the GPU. That's going to be ~16-32MB (depending on the exact format used).
3. The CPU then burns in the subtitles (a single-threaded process).
4. The CPU copies the final image back into the GPU for encoding.
That has to be done 24x a second (assuming you are playing content that’s 23.976fps). It wasn’t long ago that people were struggling to burn in PGS subtitles on 1080p content when they were fully capable of transcoding entirely in software when not burning in."
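Some rough back-of-the-envelope numbers for those copies (my own arithmetic, assuming the frames cross the bus as NV12 for 8-bit and P010 for 10-bit content):

```python
def frame_bytes(width, height, bits_per_sample=8):
    # 4:2:0 chroma subsampling means 1.5 samples per pixel;
    # 10-bit samples are typically stored in 16-bit containers (P010)
    bytes_per_sample = 2 if bits_per_sample > 8 else 1
    return int(width * height * 1.5 * bytes_per_sample)

fps = 23.976
for name, (w, h, bits) in {
    "1080p 8-bit (NV12)": (1920, 1080, 8),
    "4K 10-bit (P010)":   (3840, 2160, 10),
}.items():
    fb = frame_bytes(w, h, bits)
    # Each frame crosses the bus twice: GPU->CPU for burn-in, CPU->GPU for encode
    total = 2 * fb * fps
    print(f"{name}: {fb / 2**20:.1f} MiB/frame, {total / 2**20:.0f} MiB/s of copies")
```

A 4K 10-bit frame is ~24 MiB against ~3 MiB for 1080p 8-bit, so the 4K case moves roughly eight times as much data through that single-threaded path per second.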
Is it that in step 1 the HW decode is handing off a 4K frame instead of a 1080p frame? For comparison, a 1080p→1080p transcode with PGS burn-in is smooth as butter.
As I just tested on my AS6604: a 1080p HW transcode with subtitle burn-in maxes out a single core completely (25% CPU load), yet only yields 0.3-0.5 transcode speed, which is not usable.
When I DISABLE HW transcoding, the SAME transcode works at about 40-50% CPU but yields 1.4-1.9 transcode speed.
So how can a HW transcode with subtitle burning be MORE resource intensive than a SW transcode with subtitle burning?
Maybe there is some potential for optimization here?
As I don’t want to use SW transcoding for everything by default (for obvious reasons), is there any way to force SW transcoding ONLY when subtitles need to be burned in?
Ok, that AS6604T has a Celeron J4125 processor, which is just barely powerful enough to handle one 1080p transcode using all 4 threads. It's not going to handle subtitles very well.
And with HW transcoding, that slow CPU also makes moving the frame data around for subtitle burn-in slow.
Unfortunately, there isn’t a way to automatically pick HW/no HW when subtitles are involved.