Weren’t samples already shared and the problem acknowledged by the iOS team? Why is this still a discussion? I have said it before: getting Plex to acknowledge bugs and problems is a huge issue. We’re at the 10-month mark on this issue.
I think we need to band together; we are not requesting something new here. Plex formerly advertised and acknowledged that DV is supported on ATV and is contractually bound to honor that. I’m at my wit’s end on this one. We have another running thread on HDR10 on the new ATV 2021. How can we band together and take up the issue on social media or something?
I just think that this thread for months was misunderstood by those at Plex who were able to fix the issue. I think they thought we were asking for something that could not be supported, and so it just meandered on without resolution.
Agreed. This issue has been going on without proper attention for way too long. Does anyone know how we can get the attention of anyone at Plex in another manner to raise this issue?
I don’t think there’s confusion on our side at this point.
I summarized the expectations outlined in this thread in technical terms, above.
But I think your statement is adding some confusion:
Saying it should display fine leaves out the detail that, as @acosmichippo54 pointed out, if you play back a DoVi P7 file with Plex it isn’t displaying DoVi with the dynamic metadata from the RPU; it’s displaying just the HDR10 BL.
So it isn’t displaying DoVi at all when playing back a P7 file.
This thread is about getting DoVi P5 support working in Plex, because that is what the ATV4K hardware supports.
Since essentially no consumer DoVi encoders exist outside of hardware (e.g., an iPhone 12 recording DoVi P8.4), and those don’t produce P5 files, it isn’t easy to provide sample files.
Are you able to acknowledge for us that you’re working on DoVi P5 support at various DoVi Levels? Can you provide specifics on where you stand, what the challenges are, or what you need from us?
If you still need sample files for DoVi P5, what Levels do you need?
So, are P7 files playing correctly or not?
The RPU dynamic metadata provides hinting to the player or display regarding how best to tone map down to SDR, and optionally to lower-brightness HDR. It can also be used to hint color adjustments for a BT2020->P3 conversion, but in practice content is mastered within the P3 gamut anyway, so this won’t be relevant for some time. As far as I can tell from any available documentation, the RPU metadata isn’t used at all on currently-available content when the display is capable of reproducing the brightness of the original content, and in practice, content tends to be mastered at levels that most HDR displays are going to render fairly well without any extra hints (beyond the standard ones).
So far I haven’t seen anybody actually demonstrate a perceptible difference between RPU-hinted playback and regular HDR playback on actual content (of course it’s possible for this to happen in theory, but it’d require someone to master for a much higher brightness than your display, which again is quite uncommon for compatibility reasons).
The dual-stream (“enhancement layer”) aspect of Dolby Vision seems to also be very theoretical at the moment. The second stream theoretically provides additional precision, but in practice there are 3 major issues that prevent it from being particularly useful:
Current consumer displays max out at 10-bit precision; they’ll be either dithering or truncating away the extra 2 bits anyway
The EL is coded at 1/4 the resolution of the BL, so any additional information it does provide is effectively subsampled once in luma and twice in chroma; it’s not entirely clear how this is supposed to result in a useful quality improvement
In practice, even media distributed with BL+EL Dolby Vision often doesn’t use the EL in any substantial way. I’ve seen commercially-released mainstream HDR BDs where the entire EL stream is 50%-grey across the whole image; this means that it’s signaling absolutely nothing to the DoVi implementation, and the exact same output could’ve been had without wasting all those bits on a blank stream.
As far as I can tell, it’s basically the MQA of HDR: a complex, secretive, lucratively-licensed coding scheme that ultimately provides no real benefit over the standard, but that the developer markets heavily while the tech press repeats their claims uncontested.
The dual-stream arrangement does have one theoretically-interesting application: backwards compatibility with SDR via a sort of inverse tone-mapping (i.e. profiles 4 and 9). This could allow a single piece of media to work normally on legacy SDR-only players and displays, but display in HDR on newer players and displays, at a much smaller filesize penalty than you’d get by distributing 2 entirely-separate copies. This doesn’t seem to have found a market in practice, though; every major use-case seems to have ended up landing on the “distribute separate SDR and HDR copies” route.
Meanwhile, Apple doesn’t provide any documentation on how to handle Dolby Vision content beyond the use of AVPlayer, which Plex generally avoids (since using our own player allows us to support more codecs and containers [e.g. MKV!], better subtitle rendering, higher-quality scalers, etc). It’s unclear how AVPlayer handles this content; are the RPUs parsed and their data handled within the player (this seems more reasonable, since it would allow for compositing), or are they sent to the display via HDMI (potentially resulting in any overlays being mis-mapped)? It’s all proprietary.
For profile-5 content, which uses a Dolby-proprietary color space with a murky patent situation and no current public implementations, we may need to fall back on AVPlayer (at the cost of reduced audio and subtitle codec support, and probably needing to remux). I’m going to need at least a couple samples to test this, ensure we can remux it correctly, and ensure that the app drops to AVPlayer for it. The levels involved don’t really matter; DoVi levels just indicate a maximum resolution, bitrate, and pixel rate (much like HEVC levels). Nothing Plex needs to do to handle these files depends on those values, so any level should be fine.
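For the curious, here’s a rough sketch of what that fallback path could look like on the app side, assuming the stream has already been remuxed into something AVFoundation will accept. The function name and URL handling are just placeholders for illustration, not our actual player code:

```swift
import AVFoundation
import AVKit

// Hypothetical sketch: hand a profile-5 item off to AVPlayer instead of the
// built-in player. Assumes the server has already remuxed the stream into a
// container AVFoundation accepts (e.g. fMP4/HLS); the URL is supplied elsewhere.
func makeDolbyVisionPlayer(for url: URL) -> AVPlayerViewController {
    let item = AVPlayerItem(url: url)
    let player = AVPlayer(playerItem: item)

    let controller = AVPlayerViewController()
    controller.player = player
    return controller
}

// eligibleForHDRPlayback only reports whether the current device/display chain
// can play HDR at all; it doesn't tell us which DoVi profiles are supported.
if AVPlayer.eligibleForHDRPlayback {
    // present makeDolbyVisionPlayer(for: remuxedURL) and start playback
}
```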
As a side-note, I wouldn’t expect that there’s any particular hardware peripheral on the ATV or iPhones to decode Dolby Vision. There’s nothing in the decode process that requires specialized hardware; it comes down to 4 basic steps (depending on the profile):
HEVC decode (possibly 2 concurrent streams if an EL exists; the regular decode ASIC can handle this fine)
RPU parse (these are undocumented but there’s no reason this wouldn’t be handled on the CPU; parsing this kind of syntax isn’t particularly intensive)
BL+EL combination (only if EL exists; this is essentially just a scale-and-add operation, which the GPU can handle just fine)
Tone mapping to the display’s brightness using RPU hints (this might involve some exponentiation and division… that’s about it. GPUs do this.)
You could maybe speed up some of this a little bit with a dedicated peripheral, but I don’t see any particular reason to; I’d expect all the Dolby Vision support (including which profiles it does and doesn’t support) on tvOS is ultimately down to software.
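To make step 4 above concrete, here’s a toy sketch of the kind of per-scene compression a dynamic-metadata hint enables. The curve and numbers are invented to show the shape of the work (a bit of exponentiation and division), not Dolby’s actual math:

```swift
import Foundation

// Toy illustration only: scale a scene's linear-light luminance into the
// display's range using a per-scene maximum carried in the dynamic metadata.
func toneMap(sceneLuminance nits: Double,
             sceneMaxNits: Double,      // per-scene hint (dynamic metadata)
             displayPeakNits: Double) -> Double {
    guard sceneMaxNits > displayPeakNits else {
        return nits // scene already fits the display; pass through unchanged
    }
    // A simple power-curve compression into the display's range: exactly the
    // kind of exponentiation/division a GPU shader handles trivially.
    let exponent = log(displayPeakNits) / log(sceneMaxNits)
    return pow(nits, exponent)
}

// A 4,000-nit highlight in a scene mastered to 4,000 nits lands exactly at a
// 1,000-nit display's peak, while a 100-nit mid-tone drops to only ~46 nits
// (a straight linear scale would crush it to 25 nits).
print(toneMap(sceneLuminance: 4000, sceneMaxNits: 4000, displayPeakNits: 1000)) // 1000.0
print(toneMap(sceneLuminance: 100,  sceneMaxNits: 4000, displayPeakNits: 1000)) // ≈ 46.3
```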
I’ve tested DoVi P7 files in MKV and MP4 containers and haven’t found any issues displaying the HDR10 BL with the correct colors in the correct BT.2020 color space.
The EL seems to be completely ignored, so no RPU and no FEL with 12-bit color.
So, given the ATV4K hardware, that’s as expected.
My understanding is that the MaxCLL/MaxFALL for HDR10 content will be applied according to a tone mapping curve that’s unique to each display and display processing pipeline.
Having RPU metadata means having MaxCLL/MaxFALL plus additional metadata to help standardize and improve the display of every scene or frame, to continually adjust the tone mapping curve to produce the desired color and brightness intensity.
So even if the content specifies MaxCLL of 1000nits and MaxFALL of 400nits on a display capable of 1000nits, that doesn’t mean the max brightness produced when playing back the content will be 1000nits.
It will be different on every display, and is not expected to be a linear curve, so the way each light level is displayed between the min and max will be customized according to the preferences of the manufacturer and the capabilities/settings of the display.
Hence, LG Display manufactures the same WBC and WBE OLED panels for LG Electronics and Sony Electronics displays, but the processors, processing pipelines, and development teams make different decisions about how to display the content, so the same content will appear differently on different displays using the same panel.
So I’m curious, when you say the RPU isn’t used at all, where are you seeing that?
When you say the RPU isn’t used “when the display is capable of reproducing the brightness of the original content”, what do you mean? That the MaxCLL is less than the capability of the display?
Because, in addition to the benefits above, I’ve certainly seen plenty of cases where the MaxCLL exceeded 1000nits for a target display of 1000nits, and plenty of cases where the MaxCLL exceeded the max brightness capability of my display. So are you acknowledging the RPU would then be valuable in those cases, or am I missing the point?
I don’t disagree that there is a gap between the capabilities of DoVi and what current consumer TVs can display by way of 12-bit color and brightness.
But for DoVi P7, the EL carries the RPU NALUs, so it serves a practical purpose today. Whether there is a perceptible difference between content with a FEL vs MEL is probably a debate for elsewhere.
I don’t doubt what you’re saying may be true, but the purpose of the EL isn’t to directly integrate its decoded picture at 1/4 resolution into the decoded picture of the BL. The purpose of the EL is (1) to carry the RPU, and (2) to optionally provide an additional 2bpc of color (in the case of FEL).
How Dolby integrates a 10-bit HD HDR10 4:2:0 EL into a 10-bit UHD HDR10 4:2:0 BL to get 12-bit UHD DoVi 4:2:2 video is part of their proprietary “secret sauce”.
The FEL is also only 1/4 resolution for P7 with UHD content; for HD content, it’s 1:1 with the BL resolution. Perhaps for P7 UHD content the subsampling you describe is required, but that’s their system.
My understanding is that the ATV is using LLDV (low-latency DV, a/k/a player-led DV), where the RPU NALUs would be parsed and processed by the ATV. I don’t know if the ATV is capable of display-led DV. That said, it’s doing more than sending the decoded picture over the wire, as it’s signalling player-led DV content is being sent over the HDMI connection and triggering the DV mode on the display.
So with some players capable of allowing display-led DV and some displays capable of display-led DV, it is possible for some systems to negotiate display-led DV and send the EL+RPU over the wire for the display to perform the tone mapping.
HDMI tunneling uses the sink led [read: display-led] Dolby Vision HDMI interface. Most Dolby Vision TVs are compatible with sink led HDMI. However, Sony and Panasonic TVs 2020 and older only support the source led [read: player-led] HDMI interface. These TVs are NOT compatible with HDMI tunneling.
So a system like ATV + AVPlayer + Plex would have to be flexible to support different aspects of the DoVi standards.
That’s exactly what DoVi does to implement a minimal enhancement layer (MEL) when it’s carrying only the RPU and not the full enhancement layer (FEL) video stream. The video data is 0 bytes and “the MEL consists of Dolby Vision composer and content metadata of a mid-gray flat-field video sequence, carried in a Network Abstraction Layer (NAL) unit”. (See Annex II of the spec.)
Many DoVi UHD BD releases use MELs, so that’s not surprising to see.
I don’t disagree that DoVi P7 is generally inefficient as a format. I’m sure there were many concessions made for low-powered UHD BD players, and that’s why the other DoVi profiles eschew the separate EL.
It sounds like fallback to AVPlayer would be a good approach.
There may be nothing for Plex to do differently in handling them, but it seems there may be differences in the DoVi Levels that matter to the decoder, where L6 works but L5, 7, 9 do not.
I don’t know if that ultimately gets down to ATV hardware limitations or what.
Meanwhile, using the Plex app for Google TV on a 2021 Sony TV (capable of display-led DV), I can play a dvhe.05.07 file with the correct colors that triggers the DV display mode on the TV.
So something about the way ExoPlayer in Plex is able to trigger display-led DV allows a P5 L7 stream to play back correctly, when there may be some hurdles to playback on ATV.
That’s interesting to hear. I wouldn’t be surprised if that’s true, and a deliberate choice by Apple to implement only those profiles needed for OTT streaming services, including their own.
So in your experience developing for Plex on ATV, you’re saying you’d expect it’s possible for the A10X/A12 SoCs to decode a second concurrent HEVC stream for an FEL in hardware?
For source-led handling, does that mean the display would have to signal detailed profile information over HDMI/DP/etc? Or is that kind of thing only available to sink-led mapping?
Right; in that case I don’t think the RPUs get any substantial use at all. In theory you might have some adjustment based on the mastering-display primaries, but those are generally static anyway (and can be handled in the player by mapping the static metadata to the display’s ICC profile).
Yes, it could be useful there, though almost certainly less so than you’d think from Dolby’s marketing. Let’s take a look at what the UI for Dolby Vision mapping customization looks like in Resolve:
There are 3 color space options here; nearly all HDR content is produced in P3, so we can ignore the larger BT.2020 space and focus on that and the BT.709 backwards-compatibility space. Then we see 4 peak-brightness options: 100 (SDR), 600 (low-end HDR), 1000 (mid/high-end HDR), and 2000 (studio HDR) nits. Then we can apply adjustments on a per-scene or per-frame basis for each of these output profiles, with these and a few other controls:
Any display that falls in between these profiles is just going to get an automatic interpolation between the two nearest populated trims (commonly the master and the SDR values). There’s no manual trimming going on to handle displays that don’t fully cover P3 space, nor for ones below 600 nits. I’m not sure what a display could possibly do with this information to map specifically to its own precise characteristics in a particularly customized way.
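As a sketch of what that interpolation amounts to (the trim fields and values here are invented for illustration, not Dolby’s actual schema):

```swift
// Sketch of the interpolation described above: a display whose peak falls
// between two populated trim targets just gets a blend of those trims.
// Assumes at least one trim is populated.
struct Trim {
    var targetNits: Double
    var lift: Double
    var gain: Double
    var saturation: Double
}

func interpolatedTrim(forDisplayPeak peak: Double, trims: [Trim]) -> Trim {
    let sorted = trims.sorted { $0.targetNits < $1.targetNits }
    guard let lower = sorted.last(where: { $0.targetNits <= peak }) else { return sorted.first! }
    guard let upper = sorted.first(where: { $0.targetNits > peak }) else { return sorted.last! }

    // Linear blend between the two nearest populated targets.
    let t = (peak - lower.targetNits) / (upper.targetNits - lower.targetNits)
    return Trim(targetNits: peak,
                lift: lower.lift + t * (upper.lift - lower.lift),
                gain: lower.gain + t * (upper.gain - lower.gain),
                saturation: lower.saturation + t * (upper.saturation - lower.saturation))
}

// e.g. an 800-nit display with only the 100-nit and 1000-nit trims populated
// ends up roughly 78% of the way from the SDR trim toward the 1000-nit trim.
```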
This is actually one of the few parts of Dolby Vision that’s been documented in a published standard (ETSI GS CCM 001), and the combination of the two streams is a literal addition!
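In greatly simplified form (glossing over the prediction and NLQ de-quantization steps the spec defines, and assuming the EL has already been upscaled to the BL’s resolution), the per-pixel reconstruction is just:

```swift
// Simplified sketch of the ETSI GS CCM 001 layer combination: once the EL
// sample has been de-quantized, reconstruction is per-pixel addition clamped
// to the 12-bit output range. This shows the shape, not the full composer.
func reconstructPixel(basePrediction: Double,   // predicted from the decoded 10-bit BL
                      elResidual: Double,       // de-quantized EL sample
                      maxCodeValue: Double = 4095) -> Double {
    return min(max(basePrediction + elResidual, 0), maxCodeValue)
}
```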
Thanks, I’ll give these a spin.
I’ll note I’m a server developer, but I’d expect these devices to have no problem with this, yeah. They need support for multiple concurrent streams for other purposes (e.g. video conferencing). Hardware decoders usually have a few overall limits, and can process multiple streams as long as they don’t exceed them in total; e.g. data rate, macroblock rate, total DPB size… For instance, on Intel GPUs, 1080p decodes tend to run at a few hundred fps, and that speed gets ~evenly divided amongst however many sessions you have running. Apple’s implementation might work a bit differently, but I’d expect the fundamentals to be similar.
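As a back-of-the-envelope illustration of that shared-budget idea (the budget figure below is invented for the example, not an Apple spec):

```swift
// Toy illustration: two streams fit as long as their combined pixel rate
// stays under the decoder's total throughput limit. Numbers are assumptions.
let decoderPixelsPerSecondBudget = 4096.0 * 2160 * 120   // assumed total throughput

let blPixelRate = 3840.0 * 2160 * 24     // UHD base layer at 24 fps
let elPixelRate = 1920.0 * 1080 * 24     // quarter-resolution EL at 24 fps

let fits = (blPixelRate + elPixelRate) <= decoderPixelsPerSecondBudget
print("BL + EL within decode budget: \(fits)")  // true, with plenty of headroom
```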
Not to my understanding. The tone mapping and whatever other processes specific to the profile would have already been applied by the source, and now it’s just signaling that DoVi content is present.
MaxCLL and MaxFALL are optional values stored in the DoVi L6 metadata or outside the DoVi metadata entirely, so they aren’t supposed to be used in DoVi playback at all.
They’re there purely for HDR10 compat when a device can’t display DoVi.
There are no global, static equivalents to MaxCLL/MaxFALL I’ve seen in the DoVi metadata that are used for DoVi playback; there are only the dynamic values per shot or frame.
So to legitimately play back DoVi content with the DoVi metadata in DoVi display mode, my read is that the RPU must be parsed and applied continually, even if there are no changes in the trim values.
Interesting, thanks for the reference. I’ll have to dig into the spec more. That doesn’t jibe with other discussions and interviews I’ve seen, but perhaps that was Dolby marketing.
So doesn’t that imply that DoVi mapping isn’t doing anything particularly magical based on the deeper intricacies of the panel response, and instead only operates based on the fairly basic information included in that VSVDB (essentially color primaries and pixel brightness range)?
P7 is forced to use a separate EL in order to maintain compatibility with the Blu-Ray spec. They couldn’t change color encoding of the primary track, and didn’t dare add additional NALUs to the primary encoding. Those are huge constraints.
The RPU is stored on disc in the second track (along with the EL) for the same compatibility reasons. But the RPU isn’t otherwise “part of” the EL.
P7 isn’t meaningfully superior to P5. The ICtCp color space of P5 achieves adequate banding avoidance with 10-bit color, and is superior in other aspects. P7 requires the FEL’s 12-bit expansion to achieve equivalent visual fidelity.
I believe that’s correct as well. In the case of LLDV, it’s the player performing the tone mapping, so it’s the player’s tone curve and application of the metadata based on the limited info available about the display.
But as there is a community of folks spoofing EDID in order to get their players to engage LLDV instead of HDR10, even LLDV has some desirable effects on picture quality that some people are seeking out. Like most things, it may just come down to a preference for the tone mapping the Dolby algos produce.
Regardless, the superior experience is expected from standard DV, not LLDV, where the application of the metadata can be much more bespoke for the panel/processor.
I don’t have an HD Fury or other device to know for certain that an ATV4K hooked up to a display capable of standard DV isn’t actually using standard DV. I’m taking the reports that the ATV always uses LLDV at face value, and perhaps they’re wrong or weren’t performed with a compatible display capable of standard DV.
Like other things mentioned here, it seems it could change over time in software, too.
Backwards in what sense, that the design of P7 is the result of other considerations and not concessions for low-powered UBPs?
I agree that there were numerous design considerations. I don’t think that negates that there were considerations for processing power. I guess I wasn’t attempting to assert that that was the primary consideration.
Whatever the considerations, in total, the point I was trying to make in response to Ridley was simply to agree that there are more efficient DoVi profiles in their overall use of bits than P7. I think that P5 is definitely more efficient with bits than P7, so if you’re suggesting something else, then I guess I’d have to understand more about what we disagree on.
I generally agree with everything you posted, so I’m not seeing any particular points of disagreement with what I said.
With different target uses/devices, it’s difficult to compare apples to apples.
When P5 content using IPT/ICtCp is largely used by OTT streaming services at lower bitrates than UHD BDs with P7 using BT.2020, most of us (including myself) haven’t really had an opportunity to compare. I assume that at a given video bitrate, an IPT encoding could be superior, but I don’t have a good mental map of how each would affect the overall color volume.
The only practical application I’m aware of for P7 is UHD BD, so not that I’ve seen in the wild.
I understand there are other potential uses for streaming services to use a base HD or SDR stream with P7, but I don’t know who might be using it that way.
My point was just that the spec doesn’t limit the EL to 1/4 resolution in all cases, as if it’s a feature.
I think that was potentially true, but more marketing than real-world truth. TV processors and panels are already tuned for their characteristics, and in the high-end case, individually calibrated.
It’s interesting that standard DV is no longer required, but LLDV is. Does that mean Dolby wants a logo on more devices? Or that LLDV is good enough? Probably both.
I see comments elsewhere that the ATV4K can do both modes, but may prefer LLDV. My intuition is that a device with on-screen menus might prefer LLDV when it’s available.
You used the word “eschew” in a way that might have implied that an EL was superior to the single-layer formats. I was being pedantic too.
I agree that the higher bitrates available on UHD BD are an advantage over most streaming media sources. Audio options, too! But the DV tech isn’t necessarily superior.
To be clear, decoding 2-layer DoVi is strictly and unambiguously more complex than single-layer; it increases CPU + decode ASIC + GPU [or VPU or whatever in embedded-system cases] usage.
IPT seems to be the only legitimately-useful consumer-facing innovation DoVi has that it doesn’t share in common with HDR10+ (which has much more reasonable licensing!) and other dynamic-metadata solutions. From everything I’ve seen, it does seem to legitimately reduce the negative impact of chroma subsampling on video quality (though I’m not sure by exactly how much, or how noticeable it ends up being at 4K, where the chroma resolution is already very good). It won’t be widely adopted in broader video usage, of course, because of Dolby’s patents on it. Long-term I’d expect it to ultimately be replaced by a shift away from chroma subsampling in video encoding, but that’s still years out and will require both more-efficient codecs and increases in memory bandwidth.
A logo, and also those sweet sweet license-program royalties…