So, are P7 files playing correctly or not?
The RPU dynamic metadata provides hinting to the player or display about how best to tone map down to SDR, and optionally to lower-brightness HDR. It can also hint color adjustments for a BT.2020-to-P3 conversion, but in practice content is mastered within the P3 gamut anyway, so that won’t be relevant for some time. As far as I can tell from the available documentation, the RPU metadata isn’t used at all on currently-available content when the display is capable of reproducing the brightness of the original content, and in practice content tends to be mastered at levels that most HDR displays will render fairly well without any extra hints (beyond the standard ones).
So far I haven’t seen anybody actually demonstrate a perceptible difference between RPU-hinted playback and regular HDR playback on actual content (it’s possible in theory, but it would require someone to master for a much higher brightness than your display can reach, which, again, is quite uncommon for compatibility reasons).
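To make that concrete, here’s a rough sketch (Python, made-up numbers; this is not Dolby’s actual algorithm) of the decision the dynamic metadata exists to inform. Per-scene hints only have anything to do when the display can’t reach the scene’s mastered peak:

```python
# Illustrative sketch only (made-up numbers, not Dolby's actual algorithm):
# per-scene hints only matter when the display can't reach the scene's peak.
scene_hints = [{"scene": 1, "max_nits": 600}, {"scene": 2, "max_nits": 1000}]
display_peak = 1000  # nits -- a display that covers typical mastering levels

for hint in scene_hints:
    if hint["max_nits"] <= display_peak:
        action = "pass through (the hint changes nothing)"
    else:
        action = f"tone map {hint['max_nits']} nits down to {display_peak}"
    print(f"scene {hint['scene']}: {action}")
```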
The dual-stream (“enhancement layer”) aspect of Dolby Vision also seems to be largely theoretical at the moment. The second stream theoretically provides additional precision, but in practice there are 3 major issues that prevent it from being particularly useful:
- Current consumer displays max out at 10-bit precision; they’ll either dither or truncate away the extra 2 bits anyway
- The EL is coded at 1/4 the resolution of the BL, so any additional information it does provide is effectively subsampled once in luma and twice in chroma; it’s not entirely clear how this is supposed to result in a useful quality improvement
- In practice, even media distributed with BL+EL Dolby Vision often doesn’t use the EL in any substantial way. I’ve seen commercially released mainstream HDR BDs where the entire EL stream is 50%-grey across the whole image; that signals absolutely nothing to the DoVi implementation, and the exact same output could have been produced without wasting all those bits on a blank stream (see the sketch after this list).
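A quick back-of-the-envelope sketch of those last two points. The resolution numbers are just the usual 4:2:0 arithmetic; the “neutral mid-grey” reading is an assumption based on the EL coding residuals around an offset:

```python
# Illustrative only: EL resolution arithmetic and the "blank EL" case.
bl_luma = (3840, 2160)                            # UHD base layer
el_luma = (bl_luma[0] // 2, bl_luma[1] // 2)      # EL carries 1/4 the pixels
el_chroma = (el_luma[0] // 2, el_luma[1] // 2)    # 4:2:0 subsamples it again
print(el_luma, el_chroma)                         # (1920, 1080) (960, 540)

# If the EL codes residuals around a mid-grey offset (assumed here), a
# constant 50%-grey plane is the zero-residual case: nothing gets added
# back to the BL, so the combined output equals the BL alone.
NEUTRAL = 512                     # mid-grey code value in a 10-bit plane
el_sample = 512
residual = el_sample - NEUTRAL    # 0 -> the EL contributes nothing
```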
As far as I can tell, it’s basically the MQA of HDR: a complex, secretive, lucratively licensed coding scheme that ultimately provides no real benefit over the standard, but that the developer markets heavily while the tech press repeats their claims uncontested.
The dual-stream arrangement does have one theoretically interesting application: backwards compatibility with SDR via a sort of inverse tone mapping (i.e. profiles 4 and 9). This could allow a single piece of media to play normally on legacy SDR-only players and displays, but display in HDR on newer ones, at a much smaller file-size penalty than distributing 2 entirely separate copies. This doesn’t seem to have found a market in practice, though; every major use case seems to have ended up landing on the “distribute separate SDR and HDR copies” route.
Meanwhile, Apple doesn’t provide any documentation on how to handle Dolby Vision content beyond the use of AVPlayer, which Plex generally avoids (since using our own player lets us support more codecs and containers [e.g. MKV!], better subtitle rendering, higher-quality scalers, etc.). It’s unclear how AVPlayer handles this content: are the RPUs parsed and their data applied within the player (which seems more reasonable, since it would allow for compositing), or are they sent to the display via HDMI (potentially resulting in any overlays being mis-mapped)? It’s all proprietary.
For profile-5 content, which uses a Dolby-proprietary color space with a murky patent situation and no current public implementations, we may need to fall back on AVPlayer (at the cost of reduced audio and subtitle codec support, and probably needing to remux). I’m going to need at least a couple of samples to test this, ensure we can remux it correctly, and ensure that the app drops to AVPlayer for it. The levels involved don’t really matter; DoVi levels just indicate a maximum resolution, bitrate, and pixel rate (much like HEVC levels). Nothing Plex needs to do to handle these files depends on those values, so any level should be fine.
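As a sketch of what that routing could look like, here’s roughly how you might read the profile out of the Dolby Vision configuration box and pick a playback path. The field layout is my reading of the publicly circulated dvcC/dvvC description, so treat it as an assumption, and `parse_dv_config`/`pick_player` are hypothetical names; the point is just that only the profile matters, not the level:

```python
import struct

def parse_dv_config(payload: bytes) -> dict:
    """Parse the first four bytes of a dvcC/dvvC box payload.
    Assumed layout: version_major(8), version_minor(8), profile(7),
    level(6), rpu_present(1), el_present(1), bl_present(1)."""
    major, minor = payload[0], payload[1]
    rest = struct.unpack(">H", payload[2:4])[0]
    return {
        "version": (major, minor),
        "profile": rest >> 9,              # top 7 bits
        "level": (rest >> 3) & 0x3F,       # next 6 bits
        "rpu_present": bool(rest & 0x4),
        "el_present": bool(rest & 0x2),
        "bl_present": bool(rest & 0x1),
    }

def pick_player(dv: dict) -> str:
    # Only the profile matters for routing; the level (max resolution,
    # bitrate, pixel rate) doesn't change anything we need to do.
    return "AVPlayer" if dv["profile"] == 5 else "own player"

# e.g. pick_player(parse_dv_config(dvcc_payload))
```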
As a side note, I wouldn’t expect there to be any particular hardware peripheral on the ATV or iPhones to decode Dolby Vision. There’s nothing in the decode process that requires specialized hardware; it comes down to 4 basic steps (depending on the profile):
- HEVC decode (possibly 2 concurrent streams if an EL exists; the regular decode ASIC can handle this fine)
- RPU parse (these are undocumented but there’s no reason this wouldn’t be handled on the CPU; parsing this kind of syntax isn’t particularly intensive)
- BL+EL combination (only if EL exists; this is essentially just a scale-and-add operation, which the GPU can handle just fine)
- Tone mapping to the display’s brightness using RPU hints (this might involve some exponentiation and division… that’s about it. GPUs do this. See the sketch after this list.)
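For a sense of the shape of that work, here’s a minimal Python/NumPy sketch of the last two steps. This is not Dolby’s actual reconstruction math (the real thing applies RPU-defined mapping curves); the point is just that it amounts to scale-and-add plus a per-pixel curve, which is bread-and-butter GPU work:

```python
import numpy as np

def combine_bl_el(bl: np.ndarray, el: np.ndarray, el_neutral: float = 0.5) -> np.ndarray:
    """Upscale the quarter-resolution EL and add its residual to the BL.
    Nearest-neighbour repeat keeps the sketch short; values assumed in [0, 1]."""
    el_up = el.repeat(2, axis=0).repeat(2, axis=1)      # 2x in each dimension
    return np.clip(bl + (el_up - el_neutral), 0.0, 1.0)

def tone_map(nits: np.ndarray, display_peak: float, content_peak: float) -> np.ndarray:
    """Toy extended-Reinhard roll-off: a few multiplies and a divide per pixel,
    i.e. exactly the kind of math a fragment shader chews through."""
    if content_peak <= display_peak:
        return nits                           # display covers the full range
    l = nits / display_peak                   # 1.0 == display peak
    w = content_peak / display_peak           # input level that must land at 1.0
    return display_peak * l * (1.0 + l / (w * w)) / (1.0 + l)
```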
You could maybe speed some of this up a little with a dedicated peripheral, but I don’t see any particular reason to; I’d expect all of the Dolby Vision support on tvOS (including which profiles are and aren’t supported) ultimately comes down to software.