As soon as I enable it, my NAS falls back to SW transcoding and can’t keep up. According to the documentation it should work out of the box, since I am using Docker. Are there any prerequisites in terms of Quick Sync version? I am pretty sure I read somewhere that HW tone mapping is only supported starting with Ice Lake or something along those lines. I am using a Gemini Lake based CPU (J4125); while it is not the latest and greatest, it is not that old either.
Last but not least, I forced the use of the i965 driver because of the transcoding bug plaguing Gemini Lake GPUs. Is this still needed, or was the underlying bug also fixed by the new release?
Nevertheless, this is an awesome feature. As far as I can tell, Plex is a pioneer in this field. Huge thanks!
+1 from me: I am using a Synology NAS and can’t install this via the terminal. This needs to be provided with the server SW package. I am using the Plex Server app installed via the DSM management UI.
**admin@PanchiDS** : **/** $ sudo apt install ocl-icd-libopencl1 beignet-opencl-icd
Password:
sudo: apt: command not found
**admin@PanchiDS** : **/** $
You have to read the documentation for your NAS. synopkg expects a path to the installation package. I am not sure whether you have an alternative such as pkg or ipkg that works like apt. Otherwise, use the GUI.
Okay, I just saw that the Docker base image was bumped to Ubuntu 20.04. Since then, I can report that it now works! I get HW transcoding and tone mapping.
For everyone running on a NAS (actually - everywhere), I’d recommend using Docker - it just works.
Still, mine was also struggling in some cases (e.g. a 4K to 4K transcode).
During my brief testing, it worked OK with 4K 60 Mbit content being transcoded to 1080p 15 Mbit. What surprised me is that the CPU never seemed to be the bottleneck. Even when the transcoding was not fast enough, CPU usage from the transcoder was below 10%.
If I may explain some of the resource bottlenecks you might run into on NAS processors (e.g. Celeron J3xxx and J4xxx) –
The OpenCL process does the following:
1. Reads a video frame from the file
2. Uses the Intel ASIC to decode the HEVC HDR to a raw image
3. Launches the OpenCL task to process the raw image
4. Uses the Intel ASIC to encode the raw image back to H.264
As you can see, there is a lot of activity on the data bus.
33,177,600 bytes per 2160p image (3840 x 2160 pixels x 4 bytes per pixel for 10-bit color)
Multiplied out at 30 fps (29.97 base assumed):
995,328,000 bytes/sec == about 949 MiB/sec
The J4105 is spec’d at DDR4/LPDDR4 up to 2400 MT/s
Add to this the normal load of running PMS and the rest of the NAS (technically ‘noise’ but it is there to contend with as part of the overall loading)
I’m simply sharing that there do exist CPU-memory limits and this is “big data” to these smaller CPUs which, in Synology’s case, must also continue to run the mdadm software RAID.
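For anyone who wants to redo the numbers for another resolution or frame rate, here is a minimal Python sketch of the arithmetic above. The 4 bytes per pixel is an assumption inferred from the 33,177,600-byte per-frame figure, and the function names are made up for illustration. It counts a single pass of raw frames only; real traffic is higher, since each frame is written and read more than once across the decode, OpenCL, and encode steps.

```python
# Rough estimate of raw-frame data volume for a 2160p stream.
# BYTES_PER_PIXEL = 4 is an assumption that reproduces the
# 33,177,600 bytes-per-frame figure quoted in the post.

WIDTH, HEIGHT = 3840, 2160
BYTES_PER_PIXEL = 4
FPS = 30  # 29.97 rounded up


def frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of one raw (uncompressed) frame in bytes."""
    return width * height * bytes_per_pixel


def stream_bytes_per_second(width: int, height: int,
                            bytes_per_pixel: int, fps: int) -> int:
    """Bytes per second for a single pass of raw frames."""
    return frame_bytes(width, height, bytes_per_pixel) * fps


per_frame = frame_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL)
per_second = stream_bytes_per_second(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS)

print(f"{per_frame:,} bytes per frame")
print(f"{per_second:,} bytes per second")
print(f"{per_second / 2**20:.0f} MiB per second")
```

Changing `WIDTH`, `HEIGHT`, or `FPS` shows how quickly the volume drops for 1080p sources.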
@ChuckPa
Could you explain how this works with the native app on my DS918+?
I didn’t change anything and HW tone mapping worked. But with high-bitrate 4K HDR, transcoding down to 3 Mbit is the limit. Anything higher starts buffering.
As I’ve stated in the DS918+ and other Synology threads:
None of the Native Synology apps have OpenCL tone mapping. By extension, no native NAS apps have it.
The only native support for tone mapping is Ubuntu 20.10 based. LinuxServer.io’s updated Docker image contains that support (at our request)
RPM (Desktop Redhat Linux) does not have native support yet.
I will be adding native tonemapping support to all the NAS platforms as quickly as Engineering completes building the Beignet & OpenCL libraries internally. (it is a big task to complete for all our platforms).
The next, obvious, question I am often asked: “Why only Ubuntu?”
The answer is easy: Engineering wanted to give a small gift to as many people as possible by releasing this a bit earlier.
Lastly,
Optimization uses tone mapping, oddly enough.
I will forward the thumbnails issue. I do ask for patience, as this is a BIG feature jump and, as said above, only a partial deployment at this time.
Might adding a RAM stick then improve performance? A NAS typically comes with one populated stick on a dual-channel CPU. A second stick would double the theoretical bandwidth.
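To put a rough number on that, here is a small Python sketch of the theoretical peak. It assumes a standard 64-bit DDR4 channel and the 2400 MT/s rating quoted earlier in the thread; the function name is made up for illustration. Sustained real-world throughput is lower than this peak, and the iGPU shares it with the CPU.

```python
# Theoretical peak memory bandwidth, single vs dual channel.
# Assumes a standard 64-bit (8-byte) DDR4 channel and the
# 2400 MT/s rating mentioned earlier in the thread.

def peak_bandwidth_bytes(mega_transfers_per_s: int, channels: int,
                         bus_width_bits: int = 64) -> int:
    """Peak bytes per second: transfers/s x bytes per transfer x channels."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * 1_000_000 * bytes_per_transfer * channels


single = peak_bandwidth_bytes(2400, channels=1)
dual = peak_bandwidth_bytes(2400, channels=2)

print(f"single channel: {single / 1e9:.1f} GB/s")
print(f"dual channel:   {dual / 1e9:.1f} GB/s")
```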
To elaborate on my perf report, I am taking advantage of dual channel already.