# The science of loudness

## Metadata

- **Channel:** fasterthanlime
- **YouTube:** https://www.youtube.com/watch?v=98NB0BcaZTI
- **Source:** https://ekstraktznaniy.ru/video/37524

## Transcript

### Segment 1 (00:00 - 05:00) [0:00]

This video is sponsored by Curiosity Box. My watch has a noise app. It shows dB for decibels. My amp has a volume knob, which also shows decibels, although negative ones this time. And finally, my video editing software has a ton of meters, which are all in decibel or decibel-adjacent units. How do all these decibels fit together? Are the decibels from my watch and amp related? And if so, how? I've decided to spend 20 minutes of your time answering that question. I also spent a month of my time building my own loudness meter in Rust. It shows RMS, sample peak, true peak, loudness units and stereo waveforms. So if you want to find out more about how these are measured, how you can mix the rodio crate with egui or make custom UI widgets, the source code is available right now to patrons of all tiers. As for the article version of this video, which goes a little more in depth into the maths and which you can read at your own pace, it's available right now on my blog for free for everyone. The link is in the description. Sound, like wind, is more of a concept than a thing, since it's the name we've given to a specific behavior of particles. When you strum a guitar, the string vibrates, transferring energy to the body of the guitar, which amplifies it and projects it into the air as a pressure wave. This wave eventually makes its way to the ear, where some processing is already being done by its unique mechanical design. After being collected by the outer ear, the wave is ferried through the ear canal, to the eardrum, where three tiny bones amplify it. Then its destination, the inner ear, where hair cells travel up and down some fluid. Eventually it's converted by chemicals into electrical signals and interpreted by the brain as sound. People experiencing hearing loss who use a cochlear implant bypass the mechanical parts of that pipeline, relying instead on microphones and speech processors.
Although those implants do not replicate natural hearing, they give the brain enough information to recognize and process human speech and environmental sounds. For the rest of us, our ears detect tiny changes in pressure. You have to realize that there is pressure constantly being applied to our bodies, on the order of one atmosphere, or about 100,000 pascals, the SI unit for pressure. But you'll notice that my watch's noise app doesn't use pascals. In fact, no audio equipment I've ever looked at uses pascals. Instead, they show sound pressure level, defined as follows. Decibels are a logarithmic unit expressing a ratio. In this case, the ratio between p, a pressure we measured, and p0, a reference pressure. In air, because sound can also travel through water and other media, we usually pick 20 micropascals for p0, which is about the quietest sound the human ear can detect. Instead of having a linear scale that spans eight orders of magnitude from micropascals to the thousands, decibels give us a nice human-friendly scale, going from 0 decibels to 194 decibels. Above 194 decibels, we don't get sound waves, we get shock waves. The pressure amplitude would need to be more than one atmosphere, resulting in negative absolute pressure, which is impossible. Once you reach a vacuum, that's it. There's no going any more vacuumy. We haven't yet elucidated what the decibels on my amp mean. I would call those dBFS for decibel relative to full scale, because at 0 dB, we would get the absolute maximum power the amp can output, which would be damaging to my ears and to my relationship with the neighbors. The formula is the same, except that we don't pick a reference p0, like with sound pressure levels. Here, we consider an input signal and an output signal. Say the solid curve here is our input signal, with amplitude 1, and the dotted curve is the signal after it comes out of some system with amplitude 0.2. Based on those amplitudes, our system has a gain of -14 dB.
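
The two decibel figures above can be checked with a few lines of Python. This is a minimal sketch, not the author's meter code: the function names are made up for illustration, and the exact value used for one atmosphere (101,325 Pa) is an assumption, since the transcript rounds it to about 100,000 Pa.

```python
import math

P0_AIR = 20e-6              # reference pressure in air: 20 micropascals
ONE_ATMOSPHERE = 101_325.0  # assumed exact value; the transcript says "about 100,000 pascals"


def spl_db(pressure_pa: float, reference_pa: float = P0_AIR) -> float:
    """Sound pressure level: 20 * log10 of measured pressure over reference pressure."""
    return 20.0 * math.log10(pressure_pa / reference_pa)


def gain_db(output_amplitude: float, input_amplitude: float) -> float:
    """Same formula, but the reference is the input signal instead of a fixed p0."""
    return 20.0 * math.log10(output_amplitude / input_amplitude)


print(round(spl_db(P0_AIR), 1))          # 0.0, the threshold of hearing
print(round(spl_db(ONE_ATMOSPHERE), 1))  # 194.1, the ceiling for sound waves in air
print(round(gain_db(0.2, 1.0), 1))       # -14.0, the gain from the amplitude example
```

Both functions share the same `20 * log10(ratio)` shape; only the choice of reference differs.
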
Those dBFS are the ones we're interested in as broadcast engineers, since we're dealing with a signal directly. Of course, most of the time, a signal will be transformed back into a sound wave, and then we'll have to worry about dB SPL. But as long as it's within an audio system, we have to worry about exceeding levels. With an analog signal, that typically results in distortion, which can be done on purpose for style. And exceeding levels in a digital system typically results in clipping, a very harsh form of distortion, which sounds quite awful. You may recognize that from public announcement systems in parks or maybe trains. To avoid that, we have to watch our levels. And over the past hundred years, we've come up with a bunch of solutions to do that, all of which are flawed in some way. In the 1930s, the BBC came up with meters that looked like this. Independently, and around the same time, the Germans also developed level meters, putting actual decibels on the scale and giving them the cute little nickname "Lichtzeigerinstrument", or light pointer instruments. Today we would call them both quasi-PPMs, PPM for Peak Program Meter and quasi because they don't actually report peaks accurately. Type 2 PPMs, for example, have an integration time of 10 ms. Any peak shorter than that gets underreported. But quasi-PPMs are still pretty good at showing peaks. A lot more than, say, VU meters, VU for volume units, which were invented in the US in the 40s and which get us a lot closer to loudness. What VU meters measure is similar to the

### Segment 2 (05:00 - 10:00) [5:00]

root mean square, which gives us the average level of a signal over a period of time. A typical VU meter has an integration time of 300 ms, giving us even more severe underreporting of peaks. Which is fine. A VU meter is not meant to show short peaks. It's meant to let a radio operator know how loud a song roughly is so they can adjust the volume and the listeners don't have to. However, I'm happy to report that since the 1940s both our understanding of human hearing and technology have improved. First off, most audio processing is now done in the digital domain, which is both a blessing and a curse. Here's Ableton showing a piece of audio. We can zoom in and see the wave, we zoom in some more, and if we zoom enough, eventually Ableton will show us individual audio samples. These PPMs are not quasi anymore. It's really easy to make a sample peak monitor, because you just look at a window of samples, say, a thousand of them, and keep whichever value is the furthest from zero. And that's your peak. Your sample peak, not your true peak. Because the actual sound wave is reconstructed from a limited number of discrete samples, it's possible for all samples to be below the maximum desired level, and yet for the reconstructed wave to be above it. To actually measure the true peak, one can use a sinc filter to upsample the signal, which means filling in additional samples between the original ones, letting us know how high that sound wave truly goes. And now, a word from today's sponsor. Curiosity Box is the subscription for thinkers. It's a really nice box. Each box turns learning into an adventure with premium science toys, hands-on experiments, and exclusive mind-expanding collectibles. Perfect for all ages, whether you're a lifelong science lover or just curious. For ages 14 to 4.5 billion. Every box is unique, packed with limited edition, non-repeatable items you won't find anywhere else. But now I need like a nice clean light source.
They're all about making science fun, engaging, and accessible. How many different colors of ball- Screw you! Because science isn't just a subject, it's a way of thinking. With each box, you get to explore fascinating topics through experiments and collectibles that challenge the way you see the world. I'm probably doing it wrong. Wait. So what are you waiting for? Use the link in description or promo code "FASTERTHANLIME" to get 25% off your first box. And remember to stay curious. So that takes care of peaks. What about loudness then? Well, we made progress there too. First, in the wrong direction. The idea of compression is to get rid of dynamic range by taking anything above a certain level and progressively making it smaller. We can apply gain to the resulting signal without clipping, making the whole thing louder. That gain is called "makeup gain" because it makes up for the loud bits that were made quieter by the compression. I can't believe that clicked for me only now. During the 2000s, sound engineers started abusing compression to make their albums sound louder and louder, based on the theory that people preferred louder things. Those loudness wars lasted until the mid-2010s, when the music industry finally tackled the problem by inventing a proper loudness unit, LKFS. The first interesting thing about LKFS is that it takes into account multiple channels and does a downmix into one value. Which begs the question, what did the BBC do with their PPMs? Well, they didn't have to worry about stereo until the late 50s, which is when they started experimenting with stereo with two separate AM transmitters. So two separate PPMs, one per channel, was an option. But they also had a different variant that showed the sum and the difference of both channels, which is important because if you have two waves of opposite phase, they cancel each other out. But the first stage of LKFS computation is filtering, to model how humans perceive sound. 
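
Going back to compression and makeup gain for a moment, the idea can be sketched on levels expressed in dB. This is a hedged illustration only: the hard-knee curve, the -20 dB threshold and the 4:1 ratio are assumptions picked for the example, not values from the video, and real compressors also have attack and release behavior that is left out here.

```python
def compress_db(level_db: float, threshold_db: float = -20.0, ratio: float = 4.0) -> float:
    """Static hard-knee compressor curve: levels above the threshold grow `ratio` times slower."""
    if level_db <= threshold_db:
        return level_db  # quiet material passes through untouched
    return threshold_db + (level_db - threshold_db) / ratio


def makeup_gain_db(threshold_db: float = -20.0, ratio: float = 4.0) -> float:
    """Gain that brings a full-scale (0 dB) input back up to 0 dB after compression."""
    return 0.0 - compress_db(0.0, threshold_db, ratio)


gain = makeup_gain_db()
loud_out = compress_db(0.0) + gain     # a full-scale peak still ends at full scale
quiet_out = compress_db(-30.0) + gain  # a quiet passage ends up much louder
print(gain, loud_out, quiet_out)       # 15.0 0.0 -15.0
```

With these assumed settings, the 30 dB gap between the quiet passage and the peak shrinks to 15 dB, which is the loss of dynamic range described above.
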
The first filter boosts everything above 1000 Hz, and the second is a high-pass filter, which attenuates anything under 100 Hz. Next, we integrate over some interval t to calculate the power of the filtered signal. Finally, this one should look familiar. It's very close to the decibel formula from earlier. But because this time we're measuring a power and not an amplitude, we use a 10x factor instead of a 20. Depending on the interval chosen to calculate loudness, we call the results different things. M for momentary, S for short-term. As for I, it's the integrated loudness, and it takes into account an entire piece of media minus the quiet parts using a standard gating mechanism. And that prevents any cheating done by audio engineers to make their sounds louder than the others, because we finally have one number that is relatively good at predicting how loud something is to the human ear. In YouTube's Stats for Nerds, we can find out how far various videos are from their mastering target of -14 LUFS. That George Michael song from 2009 is 2.9 loudness units below. As a result, it's not altered during playback. Rihanna's Umbrella, also from 2009, is 5.3 above, and so YouTube reduces the volume to 55%, a 5.3 decibel decrease. Even more recently, YouTube introduced dynamic range compression, which can amplify videos mastered way too quiet. LKFS and LUFS, same thing, different name, aren't the only units that try to take psychoacoustics into account. The first experiments regarding how loud humans think sound is date back to 1927. A few years later, Fletcher and Munson published this equal-loudness contour graph

### Segment 3 (10:00 - 10:00) [10:00]

which takes a minute to figure out. Each of the lines determines a level of loudness. Their test subjects reported that, for example, a 1000 Hz sound, blasted at 40 dB, felt as loud as a 100 Hz sound at 62 dB. In other words, we are much, much more sensitive to sounds at 1000 Hz than to those at 100 Hz. That dip, around 3-4 kHz, is where our hearing is most sensitive. We made our smoke detectors beep at that frequency for maximum alert, and our babies cry at that frequency for similar reasons. From that graph, an ISO standard was derived, specifying the A-weighting curve, which predates LKFS's K-weighting by almost 50 years. Although more basic and somewhat outdated, A-weighting is used in a bunch of places. French law requires sound level meters like these in every music venue. American work safety organizations give different recommendations when it comes to maximum sound exposure. These tables from OSHA and NIOSH use dBA, and so does the Apple Watch Noise app. Now that we know how all these units fit together, we can all be fun at the next party. Thanks for watching, and see you next time!
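
As a footnote to the A-weighting discussion above, the curve can be evaluated directly. The sketch below uses the commonly published analytic form of A-weighting with its usual constants (20.6, 107.7, 737.9 and 12194 Hz, plus a 2.00 dB offset so that 1000 Hz sits at 0 dB). The function name is made up for illustration, and the transcript itself does not give this formula, so treat it as an assumption about which curve is meant.

```python
import math


def a_weighting_db(frequency_hz: float) -> float:
    """Relative gain, in dB, that A-weighting applies to a tone at the given frequency."""
    f2 = frequency_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to 0 dB at 1000 Hz.
    return 20.0 * math.log10(numerator / denominator) + 2.00


for frequency in (100.0, 1000.0, 3000.0):
    print(frequency, round(a_weighting_db(frequency), 1))
```

A 100 Hz tone comes out roughly 19 dB down relative to 1000 Hz, in the same ballpark as the 22 dB gap between the 40 dB and 62 dB tones in the equal-loudness example.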