All color is best-effort

Contents (4 segments)

Segment 1 (00:00 - 05:00)

I do not come to you today with answers, but rather some observations and a lot of questions. Recently I was editing some video and I noticed this. Not what the finger's pointing at: the dots. Here are the separate layers that this image is made up of. The background is a stock image I've licensed from Envato Elements. Because I use it as a background image, I've cranked down the exposure in the color tab, and for a little added style and some additional readability for subtitles, I've added a tilt-shift blur. On top of that, we have some text captured from Safari as a transparent image, by right-clicking on an element in the inspector and picking "capture screenshot", one of my favorite tricks as of late. However, the transparency is not complete. GitHub's CSS has tables with an opaque background, so I added an additional 3D keyer to remove the background. Those two layers composited already show some strangeness. Bear: Oh, I can hardly see anything! Very well, enhance! It's still kind of subtle, but you can see the dots. Are those dots present without the 3D keyer? No, they're not. They're also not here on the Fusion tab at all, the FX suite that ships with Resolve. The dots are bad when playback is using full resolution, but things get a lot weirder once you scale down. This is half resolution, and this is quarter resolution. But before we dig into it, let's hear a word from today's sponsor.

There is a war going on for your mind. Every day on your phone you're exposed to a barrage of information. Now it's a funny cat picture, now it's a piece of news, and now it's the pictures of your cousin's wedding. In this war you have few allies. Brilliant is one of them. Brilliant will help you reclaim what's yours, re-establish dominion over the confines of your mind. The platform lies in wait, ready to bestow upon you thousands of high-quality lessons designed by experts from renowned universities around the world, including interactive exercises which have been proven to be over six times more efficient than classic lectures. For a price... The blood of your firstborn shall be offered in payment. Ah, so like a goat or a sheep... oh, /no/ blood sacrifices, just the monthly subscription. Yeah, that makes a ton of sense actually. Here's the thing though, I'm not sure I'm gonna have time to reshoot... the entire ad. Soooooo... Stay one step ahead with Brilliant's course on building a language model, all the way from the very basics of tokenization to building your own simple language model, with Python code you can execute right there in your browser. Everything you need to sharpen your mind is right there. Go to brilliant.org/fasterthanlime to get a free 30-day trial of everything Brilliant has to offer. You'll also get 20% off an annual premium subscription.

I have investigated this for half a day and I have good news and bad news. The good news is unpausing makes all the artifacts disappear. Oh great, so surely that means it's not present in the export, right? Well Bear, what do you think the bad news is? Even if I consider those artifacts acceptable, the tilt-shift blur effect makes them impossible to ignore. It blows up every single point into those large colored circles; even with a strength of zero for the blur, it turns those dots into streaks that are reminiscent of memory corruption. Disappointingly, I don't know what is causing this particular problem. There's a whole bunch of things that don't solve it.
The 3D keyer can work in different color spaces, and it has a bunch of knobs you can turn, but none of them truly fix it. The other keyers exhibit the same issue. Here's the ultra keyer instead. Same with the luma keyer, the chroma keyer, etc. No amount of checking pre-divide and post-multiply makes a difference. Manually adding an alpha multiply node does jack sh*. Same with the gamut limiter; there goes my theory that it was just outputting colors outside the gamut, or that because it might be using f32 values internally, something happened. Nope, no evidence. Bear, it's in the most vanilla color space you can imagine: sRGB. What's with the numbers? Oh, that's just the IEC standard for it, IEC 61966-2-1. Among other things, the standard, published in 1999 and amended in 2003, specifies the proper transfer function to use, which is almost like gamma 2.2, but not quite. Close enough, though: even if some part of the pipeline used simplified sRGB by doing gamma 2.2 instead of using the IEC curve, we wouldn't see that. Well, what would we see? In that case, probably nothing, but when we use the wrong color profile, things look, well, wrong. They can look washed out, they can look too saturated, too dark, too bright, any of these. Believe it or not, HDR is the main reason I'm using iPhones all around now. By default, my iPhone 14 with the camera app will shoot HDR video; you can disable it in the settings. But personally, I prefer shooting with the Blackmagic Camera app for iPhone, which calls color spaces by their actual names. I took four short videos in quick succession and dragged them all into DaVinci Resolve just to find out what would happen. They look different. Indeed they do. First off, there's most likely an exposure and white balance mismatch between the built-in camera app, which had everything on auto, and the Blackmagic app, where everything was fixed. But even looking at the three pieces of Blackmagic footage, things look different, even on this sRGB screenshot. If we switch to Resolve's color tab, we can see which colors are actually used in each image via the CIE chromaticity diagram.

Segment 2 (05:00 - 10:00)

The horseshoe shape in white represents the visible spectrum, and the triangle labeled "Rec. 709" represents our color space: which colors of the visible spectrum we're able to represent, or encode, in a video file. The circle is the white point. Rec. 709, P3-D65 and Rec. 2020 all use the same white point, which is called D65, which is intended to represent average daylight and has a correlated color temperature of approximately 6500 K. "CIE standard illuminant D65 should be used in all colorimetric calculations requiring representative daylight, unless there are specific reasons for using a different illuminant." Without agonizing too much about the science of it, we can see a couple of interesting things. The Rec. 709 footage feels very noisy for some reason. I have no idea if it's normal or not, but we don't have the same thing in the P3-D65 footage, so maybe it's an encoding artifact; I'm just not sure. The thing that's hard to ignore is on the Rec. 2020 chromaticity diagram: much like if you crank the gain on an audio track, you can see it clip, you can see the peaks being flattened to the minimum and the maximum representable value, as if clipped with scissors. Similarly, we can feel that our Rec. 2020 footage feels a little cramped in our Rec. 709 color space; it wants to get out. But that's not even what feels wrong with this comparison. The Rec. 709 footage looks less wrong to me; the Rec. 2020 has some colors that look off, the floor is too bright, the cat's fur becomes too bright too quick. Right, the whole reason we have a gamma curve in the first place is to spend our bits wisely. I know it's more complicated, don't crucify me in the comments, just watch the whole video. If all you had was 16 shades of gray, sorry E. L. James, which ones would you choose? Linear or sRGB? I'll take the sRGB ones any day. Exactly: with a linear progression in terms of light emission, the shades become too bright much too quickly. And that's what the curve I showed earlier is all about. Reading this graph, we can see that an encoded value of 0.1 is barely equivalent to 0.01 lightness. To obtain 0.1 lightness, we have to go all the way up to 0.35. If we weren't using gamma curves like these, colors would look a lot worse, especially given how long we've been using 8-bit color, by which I mean only having 256 different values per color channel. This is gonna get a little technical. Don't run away. We're in this together. Before things get a lot more confusing, let's ask ourselves: what is the exact name of the transfer function we just plotted? As far as I can tell, to the best of my knowledge (don't kill me), it's a reverse OETF. An OETF, opto-electronic transfer function, is used when capturing or encoding an image. The camera sensor detects some amount of light, and we need to figure out which integer value to encode it as in the video signal. An EOTF, electro-optical transfer function, is used by monitors and displays, which take a signal and have to decide how much light to emit for each integer value of the signal. More or less. Finally, an OOTF is the composition of the OETF and the EOTF, and we do not need to worry about it at all. You can check the Wikipedia page on transfer functions in imaging, if you really must. Really, the only one we have to care about is the OETF, which has been applied in our camera, and that we need to reverse in order to know how things were. For sRGB, the OETF is defined like this.
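The formula itself appears on screen in the video; as a stand-in, here's a minimal Python sketch of the IEC 61966-2-1 piecewise curve and its inverse (the function names are mine):

```python
def srgb_oetf(v: float) -> float:
    # v: linear lightness in [0, 1] -> encoded signal e(v) in [0, 1]
    if v <= 0.0031308:
        return 12.92 * v                   # linear segment near black
    return 1.055 * v ** (1 / 2.4) - 0.055  # exponential segment

def srgb_oetf_inverse(e: float) -> float:
    # e: encoded signal in [0, 1] -> linear lightness in [0, 1]
    # this is the "d" in the round-trip check d(e(v)) = v below
    if e <= 0.04045:
        return e / 12.92
    return ((e + 0.055) / 1.055) ** 2.4

# The numbers read off the graph earlier:
print(round(srgb_oetf_inverse(0.1), 3))  # ~0.01: encoded 0.1 is barely 0.01 lightness
print(round(srgb_oetf(0.1), 3))          # ~0.349: to get 0.1 lightness, encode ~0.35
```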
Lowercase v is the lightness value, how much light hit the camera sensor, and e of v is the encoded signal, the RGB value we put in the video file. And so we can define the reverse OETF, d, as in the sketch above. Now, how do we know that we got the inverse right? Well, it's not very rigorous, but we can plot them both on a graph and see the symmetry across the y = x line. It looks a little bit like an almond. Another way is to take d of e of v and simplify, to see if we fall back to v: first with the linear part, and then with the exponential part. And yes, we do, because this video is scripted. Back to our footage, there's another fun color visualization we can look at: the parade. This displays red, green and blue values from left to right, here on a scale from 0 to 1023 to mimic 10-bit encoding, although you can configure that. Our Rec. 709 footage, which looks good, has a good spread. Our Rec. 2020 footage, however, seems squished at the top of the scale. Luckily, there's a slider for that. But here's the thing: the footage looks fine when I review it on my iPhone. It's not blown out or anything. Good question, I AirPlayed that screenshot to myself and got an sRGB JPEG file. It definitely looks more vivid on my iPhone screen when I actually review the footage than on the screenshot. I'm not actually sure iPhones can take HDR screenshots. But the point is, the iPhone did something to that Rec. 2020 footage. It's showing more colors to me on the iPhone screen, but it's able to save an sRGB version as a screenshot that doesn't look as wrong as what I had when I dragged the video file onto my DaVinci Resolve timeline. In fact, my Mac can do that too: opening the footage in QuickTime shows "correct" or "less wrong" colors as well. And what that is, is tone mapping.
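The video doesn't spell out how that mapping works, so here is only a rough illustration of the idea: the difference between hard clipping (what the squished parade suggests) and one classic smooth tone-mapping curve, the Reinhard operator. This is a generic textbook sketch, not what the iPhone or QuickTime actually do:

```python
def hard_clip(l: float) -> float:
    # Naive conversion: anything brighter than SDR white flattens at 1.0,
    # like the clipped peaks in the parade
    return min(l, 1.0)

def reinhard(l: float) -> float:
    # Classic global tone mapping: compresses highlights smoothly instead,
    # so bright HDR values stay distinguishable in SDR
    return l / (1.0 + l)

for l in (0.5, 1.0, 2.0, 4.0):
    print(l, hard_clip(l), round(reinhard(l), 3))
```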

Segment 3 (10:00 - 15:00)

How do we know for sure how color is encoded into our H.265 files? The macOS Finder gave us three numbers for each of those files: 1-1-1 for Rec. 709, 12-1-6 for P3-D65, 9-18-9 for Rec. 2020 HDR. Those numbers tell us all we need to know, because they're standardized in the ITU-T H.265 recommendation, which is a free download, and I have gone through its 728 pages to fish out all the relevant information. The first number is for color primaries. Notable numbers include 1, 4, 5, 9 and 12. Number one is for Rec. ITU-R BT.709-6 and sRGB, so high-definition but standard dynamic range video content. Four is for BT.601, 625-line, so PAL countries, Europe and whatnot. Five is for NTSC, so only 525 scan lines. Nine is for BT.2020-2 and BT.2100-2, so HDR content, high dynamic range. And 12 is for SMPTE ST 2113, quote, "P3D65", which sounds familiar. And those color primaries are the x and y coordinates of what the color space considers pure red, pure green and pure blue in the CIE 1931 color space. Again, the horseshoe is the entire visible spectrum, and the triangles are color gamuts that represent which colors we can actually encode as integers, as video signal. The Rec. 2020 gamut is larger, it covers more of the visible spectrum than Rec. 709, and that's why the cloud of points in this chromaticity graph feels cramped: it's actually Rec. 2020 footage that we've clamped down to Rec. 709, and we have to throw away some of the colors that were encoded into the original footage. Wait a minute, I have several questions. Go right ahead. First, why didn't we just pick a super large gamut that covers the entire visible spectrum? Why is Rec. 709 so small to begin with? Okay, the short answer is 8-bit color. We talked about spending our bits wisely, and I think it's time we visualized it. Without gamma curves, all our possible encoded colors are concentrated near the illuminant, near pure white, and there are a lot of gaps near the pure red, green and blue colors. But if we use the gamma curves, you can see that all the discrete colors spread out, away from the illuminant and closer to the color primaries. This is valid for Rec. 709 and Rec. 2020 just as well. And that's what the second number is for: TRC, for transfer characteristics, which is really transfer functions, or tone response curves, or gamma curves, or yada yada. Well, it's not that simple. Note 5 says some values of TRC are defined in terms of a reference OETF and others in terms of a reference EOTF, because they've been standardized in different ways. In the cases of Rec. 709 and Rec. 2020, even though the value is defined in terms of a reference OETF, so optical to electrical, "a suggested corresponding reference EOTF characteristic function for flat panel displays used in HDTV studio production has been specified in Rec. ITU-R BT.1886-0". Bless me, here's the EOTF in question. So this is used to convert signal to how much light a display should emit. Capital V is the input video signal level, V for video, in the range 0 to 1. So far so good, we can normalize our data to 0 and 1, that's fine. Capital L is the screen luminance in candelas per square meter, also called nits, you might be more familiar with nits, and gamma is 2.4. That's just a lowercase Greek gamma. a is user gain, previously known as contrast: if you have a contrast control on your display, it's doing that operation. And b is user black level lift, which is also known as brightness. Finally, L_W and L_B are the screen luminance for white and black, respectively, also in candelas per square meter, or nits.
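Putting those pieces together, here's a small Python sketch of the BT.1886 reference EOTF; the 100-nit white and 0-nit black defaults are my own assumptions, for illustration only:

```python
def bt1886_eotf(v: float, l_w: float = 100.0, l_b: float = 0.0) -> float:
    # v: input video signal level, normalized to [0, 1]
    # l_w, l_b: screen luminance for white and black, in cd/m^2 (nits)
    # returns L, the screen luminance in cd/m^2
    gamma = 2.4
    root_w = l_w ** (1 / gamma)
    root_b = l_b ** (1 / gamma)
    a = (root_w - root_b) ** gamma  # "user gain", a.k.a. contrast
    b = root_b / (root_w - root_b)  # "user black level lift", a.k.a. brightness
    return a * max(v + b, 0.0) ** gamma

print(round(bt1886_eotf(0.5), 1))  # ~18.9 nits out of the 100-nit peak
```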
So yeah, what we plotted earlier isn't the EOTF; the EOTF is for displays. What we plotted was a reverse OETF, to go back to what the camera captured. Notable transfer characteristics values for H.265 include 1, for Rec. 709, where they give this definition for the OETF. How do we know it's an OETF? Because its input is L_C, a linear optical intensity with a nominal real-valued range of 0 to 1. The values alpha and beta are constants defined so that the curve segments meet at the breakpoint; we've seen that before. For TRCs 1, 6, 11, 14 and 15, we have these values. Sure, let me just rewrite the OETF in a way that's a little more consistent with our sRGB work from earlier, like so (see the sketch below). Yeah, only with a different slope A for the linear bit, a different breakpoint V, a different constant C, and gamma 2.2 repeating rather than 2.4. Here's a plot of the sRGB gamma curve zoomed in around 0 and 0.01. And here's the Rec. ITU-R BT.709-6 curve, which we'll hereafter lovingly refer to as just BT709, BT for broadcast television.
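Here's that rewrite as a Python sketch, using the commonly quoted rounded constants (the spec pins alpha and beta down to many more digits so that the two segments meet exactly):

```python
ALPHA = 1.099  # slope constant for the power segment (rounded)
BETA = 0.018   # breakpoint between the linear and power segments (rounded)

def bt709_oetf(l: float) -> float:
    # l: linear optical intensity Lc in [0, 1] -> encoded signal V in [0, 1]
    if l < BETA:
        return 4.5 * l                      # linear segment, slope 4.5
    return ALPHA * l ** 0.45 - (ALPHA - 1)  # 0.45 = 1/2.222..., the "2.2 repeating"

print(round(bt709_oetf(BETA), 4))  # ~0.081 from both segments: they meet here
print(round(bt709_oetf(0.1), 4))   # compare against srgb_oetf(0.1) from earlier
```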

Segment 4 (15:00 - 18:00)

Moving on to HDR: value 18 refers to ARIB STD-B67, also Rec. ITU-R BT.2100-2 HLG, which we'll simply refer to as HLG moving forward, for hybrid log gamma. The definition given in version 10 of the spec is this monstrosity, reproduced here with pleasure. Well, that's because in spirit, lightness levels are no longer between 0 and 1; they're between 0 and 12. That's right, HDR is not just 10-bit or 12-bit color, it's also brighter colors.
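Untangled into Python, the monstrosity comes out roughly like this, in the ARIB-style formulation where scene light runs from 0 to 12 (the constants are the ones BT.2100 uses; the exact layout of the on-screen formula may differ):

```python
import math

A = 0.17883277                 # HLG constant from the spec
B = 1 - 4 * A                  # 0.28466892
C = 0.5 - A * math.log(4 * A)  # 0.55991073

def hlg_oetf(e: float) -> float:
    # e: scene light, with reference white at 1.0 and peak at 12.0
    # returns the encoded signal in [0, 1]
    if e <= 1.0:
        return 0.5 * math.sqrt(e)   # square-root segment for SDR-ish levels
    return A * math.log(e - B) + C  # logarithmic segment for HDR highlights

print(hlg_oetf(1.0))             # 0.5: reference white lands mid-signal
print(round(hlg_oetf(12.0), 4))  # ~1.0: the brightest highlight tops out the signal
```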
There's a cool demo called "Wanna See a Whiter White?" that embeds an HDR video to show you that when you say #FFFFFF on a webpage, or rgb(255, 255, 255), you're really referring to the brightest standard dynamic range white, which really depends on your display. Me, I have a pair of LG computer displays at home, which carry the VESA DisplayHDR 400 badge, meaning they should be able to reach all the way to 400 nits (remember, nits are just candelas per square meter, of course), and they've been tested to reach 413, so that's cool. But that's the absolute highest they'll go. Right now I have them using the DCI-P3 color profile, which we haven't seen before, and I have the high dynamic range option left unchecked in macOS's display settings. So, homework: how white is my white? I don't actually know, because I don't have a luminance meter like the Konica Minolta LS-150. So I can't measure how much light my screen puts out; I can only compare various shades with my eyes, which adapt quickly to various lighting conditions. In fact, that's precisely why sRGB is traditionally displayed with a gamma of around 2.2, suitable for a typical office setting. Rec. 709 is typically displayed with gamma 2.4, for darker living rooms in the evening, and DCI-P3 is displayed with gamma 2.6, for near-dark movie theaters. Because DCI means "Digital Cinema Initiatives". You can see the three different curves here. The higher the display gamma, the deeper the blacks. In other terms, it improves contrast. And that's where things get confusing: this display gamma is not the same as the encoding gamma that's used in the OETFs we've seen. The sRGB OETF is pretty close to a gamma of 2.2. The Rec. 709 OETF, even though it uses gamma 2.2 repeating for the exponential part, is overall better approximated by gamma 1.96. The gamma used in an OETF, again, is really just there to make sure we make the most out of our bits, whereas the display gamma is usually adjustable on computer monitors and TVs, to suit the overall lighting environment. It's all about contrast. And in the cinema, they do the opposite: they control the environment to match the screen. Gamma is such a fascinating operation to me, because it always maps 0 back to 0 and 1 back to 1, as opposed to something like lift or gain. I have had a lot of fun making these interactive widgets; you can go and see them on my website right now if you're a patron at 10 euros per month or above. If you're not, you can wait six months for the full episode. Don't go anywhere, there's a conclusion. Because I left this draft in that state for months while working on other projects in the meantime, I kind of forgot about it. And at some point I thought I discovered that if you tell DaVinci Resolve to use an HDR intermediate color space while editing, then the stars were gone. But I never could really put my finger on what was going wrong. And then it happened again. Like before, it's barely noticeable at full resolution, and impossible to ignore at half resolution. I discovered that in the color tab, increasing the gamma slider makes the problem lesser, although it also changes the colors, and decreasing the gamma slider makes the problem much, much more visible. I even found a way to isolate just the artifacts, so you can see a night full of stars. My conclusion was almost that if you use a gamut mapping node, not a gamut limiter node, then it works around the problem, since that's how I worked around the problem for my unsynn video. But you know what? It doesn't work anymore. My workaround workaround'nt. And so the mystery remains complete. It's probably a floating point thing, but which one? Who knows? Not me. Bye! *festive music trumpeted by mouth*
