But what are Hamming codes? The origin of error correction

20:05

But what are Hamming codes? The origin of error correction

3Blue1Brown 04.09.2020 2 887 533 просмотров 100 962 лайков

Machine-readable: Markdown · JSON API · Site index

Смотреть на YouTube

Поделиться Telegram VK Бот

Транскрипт Скачать .md

Анализ с AI

Описание видео

A discovery-oriented introduction to error correction codes. Part 2: https://youtu.be/b3NxrZOu_CE Ben Eater:'s take: https://youtu.be/h0jloehRKas Help fund future projects: https://www.patreon.com/3blue1brown An equally valuable form of support is to simply share some of the videos. Special thanks to these supporters: https://3b1b.co/hamming-thanks Heavily related is the chessboard puzzle I did with Matt Parker: https://youtu.be/as7Gkm7Y7h4 You can read Hamming's own perspective on his discovery of these codes in chapter 12 of "The Art of Doing Science and Engineering". https://amzn.to/3lwcnmh The viewer Harry Li made an interactive on this topic: https://harryli0088.github.io/hamming-code/ Thanks to these viewers for their contributions to translations Hebrew: Omer Tuchfeld Hungarian: Fabó Bence Spanish: agustin-j ------------------ These animations are largely made using manim, a scrappy open-source python library: https://github.com/3b1b/manim If you want to check it out, I feel compelled to warn you that it's not the most well-documented tool, and it has many other quirks you might expect in a library someone wrote with only their own use in mind. Music by Vincent Rubinetti. Download the music on Bandcamp: https://vincerubinetti.bandcamp.com/album/the-music-of-3blue1brown Stream the music on Spotify: https://open.spotify.com/album/1dVyjwS8FBqXhRunaG5W5u ------------------ 3blue1brown is a channel about animating math, in all senses of the word animate. And you know the drill with YouTube, if you want to stay posted on new videos, subscribe: http://3b1b.co/subscribe Various social media links: Website: https://www.3blue1brown.com Twitter: https://twitter.com/3blue1brown Reddit: https://www.reddit.com/r/3blue1brown Instagram: https://www.instagram.com/3blue1brown_animations/ Patreon: https://patreon.com/3blue1brown Facebook: https://www.facebook.com/3blue1brown

Методичка по этому видео

Структурированный конспект

Коды Хэмминга: как исправлять ошибки в данных с минимальной избыточностью

Коды коррекции ошибок Хэмминга — как с помощью всего нескольких контрольных битов находить и исправлять ошибки в цифровых данных. Математическая элегантность защиты информации.

Оглавление (8 сегментов)

<Untitled Chapter 1>

Have you ever wondered how it's possible to scratch a CD or a DVD and still have it play back whatever it's storing? The scratch really does affect the 1s and 0s on the disk, so it reads off different data from what was stored, but unless it's really scratched up, the bits it reads off are decoded into precisely the same file that was encoded onto it, a bit for bit copy, despite all those errors. There is a whole pile of mathematical cleverness that allows us to store data, and just as importantly to transmit data, in a way that's resilient to errors. Well, okay, actually it doesn't take that much cleverness to come up with a way to do this. Any file, whether it's a video or sound or text, some code, an image, whatever, is ultimately some sequence of 1s and 0s. And a simple strategy to correct any bit that gets flipped would be to store three copies of each bit. Then the machine reading this file could compare these three copies and always take the best 2 out of 3 whenever there's a discrepancy. But what that means is using two thirds of your space for redundancy. And even then, for all of that space given up, there's no strong guarantee about what happens if more than one bit gets flipped. The much more interesting question is how to make it so that errors can be corrected while giving up as little space as possible. For example, using the method you'll learn about this video, you could store your data in 256-bit blocks, where each block uses 9 bits, 9(! ), to act as a kind of redundancy, and the other 247 bits are free to carry whatever meaningful message or data you want. And it will still be the case that if any bit gets flipped here, just by looking at this block and nothing more, a machine will be able to identify that there was an error and precisely where it was so that it knows how to correct it. And honestly, that feels like magic. And for this particular scheme, if two bits get flipped, the machine will at least be able to detect that there were two errors, though it won't know how to fix them. We'll talk a little bit later about how this scales for blocks with different sizes. Methods that let you correct errors like this are known, reasonably enough, as error correction codes. For the better part of the last century, this field has been a really rich source of surprisingly deep math that gets incorporated into devices we use every day. The goal here is to give you a very thorough understanding of one of the earliest examples, known as a Hamming code. And by the way, the way I'm thinking about the structure of this video is less about explaining it as directly as possible, and more a matter of prompting you to invent it for yourself, with a little gentle guidance here and there. So when you feel like you see where it's going at some point, take that moment to pause, actively predict what the scheme is going to be before I tell you. Also, if you want your understanding to get down to the hardware level, Ben Eater has made a video in conjunction with this one showing you how to

Ben Eater implementing Hamming codes

actually implement Hamming codes on breadboards, which is extremely satisfying. You should know, Hamming codes are not as widely used as more modern codes, like the Reed-Solomon algorithm, but there is a certain magic to the contrast between just how impossible this task feels at the start, and how utterly reasonable it seems once you learn about Hamming. The basic principle of error correction is that in a vast space of all possible messages, only some subset are going to be considered valid messages. As an analogy, think about correctly spelled words vs incorrectly spelled words. Whenever a valid message gets altered, the receiver is responsible for correcting what they see back to the nearest valid neighbor, as you might do with a typo. Coming up with a concrete algorithm to efficiently categorize messages like this, though, takes a certain cleverness. The story begins in the 1940s, when a young Richard Hamming was working for Bell Labs, and some of his work involved using a very big expensive punch card computer that he had only limited access to. And the programs he kept putting through it kept failing, because every now and then a bit would get misread. Frustration being the crucible of invention, he got so fed up that he invented the world's first error correction code. There are many different ways to frame Hamming codes, but as a first pass we're going to go through it the way Hamming himself thought about them.

Reinventing Hamming Codes

Let's use an example that's simple, but not too simple, a block of 16 bits. We'll number the positions of these bits from 0 up to 15. The actual data we want to store is only going to make up 12 of these bits, while 4 of the positions are reserved as a kind of redundancy. The word redundant here doesn't simply mean copy, after all, those 4 bits don't give us enough room to blindly copy the data. Instead, they'll need to be a much more nuanced and clever kind of redundancy, not adding any new information, but adding resilience. You might expect these 4 special bits to come nicely packaged together, maybe at the end or something like that, but as you'll see, having them sit in positions which are powers of 2 allows for something that's really elegant by the end. It also might give you a little hint about how this scales for larger blocks. Also technically it ends up being only 11 bits of data, you'll find there's a mild nuance for what goes on at position 0, but don't worry about that for now. Like any error correction algorithm, this will involve two players, a sender who's responsible for setting these 4 special bits, and a receiver who's responsible for performing some kind of check and correcting the errors. Of course, the words sender and receiver really refer to machines or software that's doing all the checks, and the idea of a message is meant really broadly, to include things like storage. After all, storing data is the same thing as sending a message just from the past to the future instead of from one place to another. So that's the setup, but before we can dive in we need to talk about a related idea which was fresh on Hamming's mind in the time of his discovery, a method which lets you detect any single bit errors, but not to correct them, known in the business as a parity check.

Parity Check

For a parity check, we separate out only one single bit that the sender is responsible for tuning, and the rest are free to carry a message. The only job of this special bit is to make sure that the total number of 1s in the message is an even number. So for example right now, that total number of 1s is 7, that's odd, so the sender needs to flip that special bit to be a 1, making the count even. But if the block had already started off with an even number of 1s, then this special bit would have been kept at a 0. This is pretty simple, deceptively simple, but it's an incredibly elegant way to distill the idea of change anywhere in a message to be reflected in a single bit of information. Notice if any bit of this message gets flipped, either from 0 to 1 or 1 to 0, it changes the total count of 1s from being even to being odd. So if you're the receiver, you look at this message

Noise

and you see an odd number of 1s, you can know for sure that some error has occurred, even though you might have no idea where it was. In the jargon, whether a group of bits has an even or odd number of 1s is known as its parity. You could also use numbers and say the parity is 0 or 1, which is typically more helpful once you start doing math with the idea. And this special bit that the sender uses to control the parity is called the parity bit. And actually, we should be clear, if the receiver sees an odd parity, it doesn't necessarily mean there was just one error, there might have been 3 errors, or 5, or any other odd number, but they can know for sure that it wasn't 0. On the other hand, if there had been 2 errors, or any even number of errors, that final count of 1s would still be even, so the receiver can't have full confidence that an even count necessarily means the message is error-free. You might complain that a message which gets messed up by only 2 bit flips is pretty weak, and you would be absolutely right. Keep in mind, though, there is no method for error detection or correction that could give you 100% confidence that the message you receive is the one the sender intended. After all, enough random noise could always change one valid message into another valid message just by pure chance. Instead, the goal is to come up with a scheme that's robust up to a certain maximum number of errors, or maybe to reduce the probability of a false positive like this. Parity checks on their own are pretty weak, but by distilling the idea of change across a full message down to a single bit, what they give us is a powerful building block for more sophisticated schemes.

Fundamental building block

For example, as Hamming was searching for a way to identify where an error happened, not just that it happened, his key insight was that if you apply some parity checks not to the full message, but to certain carefully selected subsets, you can ask a more refined series of questions that pin down the location of any single bit error. The overall feeling is a bit like playing a game of 20 questions, asking yes or no queries that chop the space of possibilities in half. For example, let's say we do a parity check just on these 8 bits, all of the odd numbered positions. Then if an error is detected, it gives the receiver a little more information about where specifically the error is, namely that it's in an odd position. If no error is detected among those 8 bits, it either means there's no error at all, or it sits somewhere in the even positions. You might think that limiting a parity check to half the bits makes it less effective, but when it's done in conjunction with other well-chosen checks, it counterintuitively gives us something a lot more powerful. To actually set up that parity check, remember, it requires earmarking some special bit that has control for the parity of that full group. Here let's just choose position 1. For the example shown, the parity of these 8 bits is currently odd, so the sender is responsible for toggling that parity bit, and now it's even. This is only 1 out of 4 parity checks that we'll do. The second check is among the 8 bits on the right half of the grid, at least as we've drawn it here. This time we might use position 2 as a parity bit, so these 8 bits already have an even parity, and the sender can feel good leaving that bit number 2 unchanged. Then on the other end, if the receiver checks the parity of this group and they find that it's odd, they'll know that the error is somewhere among these 8 bits on the right. Otherwise it means either there's no error, or the error is somewhere on the left half. Or I guess there could have been two errors, but for right now we're going to assume that there's at most one error in the entire block. Things break down completely for more than that. Here, before we look at the next two checks, take a moment to think about what these first two allow us to do when you consider them together. Let's say you detect an error among the odd columns, and among the right half. It necessarily means the error is somewhere in the last column. If there was no error in the odd column but there was one in the right half, that tells you it's in the second to last column. Likewise if there is an error in the odd columns but not in the right half, you know it's somewhere in the second column. And if neither of those two parity checks detects anything, it means the only place that an error could be is in that leftmost column. But it also might simply mean there's no error at all. Which is all a rather belabored way to say that two parity checks let us pin down the column. From here, you can probably guess what follows. We do basically the same thing but for the rows. There's going to be a parity check on the odd rows, using position 4 as a parity bit. So in this example that group already has an even parity, so bit 4 would be set to a 0. And finally there's a parity check on the bottom two rows, using position 8 as a parity bit. In this case, it looks like the sender needs to turn that bit 8 on in order to give the group even parity. Just as the first two checks let us pin down the column, these next two let you pin down the row. As an example, imagine that during the transmission there's an error at, say, position 3. Well this affects the first parity group, and it also affects the second parity group, so the receiver knows that there's an error somewhere in that right column. But it doesn't affect the third group, and fourth group. And that lets the receiver pinpoint the error up to the first row, which necessarily means position 3, so they can fix the error. You might enjoy taking a moment to convince yourself that the answers to these four questions really will always let you pin down a specific location, no matter where they turn out to be. In fact, the astute among you might even notice a connection between these questions and binary counting. And if you do, again let me emphasize, pause, try for yourself to draw the connection before I spoil it. If you're wondering what happens if a parity bit itself gets affected, well, you can just try it. Take a moment to think about how any error among these four special bits is going to be tracked down just like any other, with the same group of four questions. It doesn't really matter, since at the end of the day what we want is to protect the message bits, the error correction bits are just riding along. But protecting those bits as well is something that naturally falls out of the scheme as a byproduct. You might also enjoy anticipating how this scales. If we used a block of size 256 bits, for example, in order to pin down a location, you need only eight yes or no questions to binary search your way down to some specific spot. And remember, each question requires giving up only a single bit to set the appropriate parity check. Some of you may already see it, but we'll talk later about the systematic way to find what these questions are in just a minute or two. Hopefully this sketch is enough to appreciate the efficiency of what we're developing here. The first thing, except for those eight highlighted parity bits, can be whatever you want it to be, carrying whatever message or data you want. The 8 bits are redundant in the sense that they're completely determined by the rest of the message, but it's in a much smarter way than simply copying the message as a whole. And still, for so little given up, you would be able to identify and fix any single bit error. Well, almost. Okay, so the one problem here is that if none of the four parity checks detect an error, meaning that the specially selected subsets of 8 bits all have even parities, just like the sender intended, then it either means there was no error at all, or it narrows us down into position 0. You see, with four yes or no questions, we have 16 possible outcomes for our parity checks, and at first that feels perfect for pinpointing 1 out of 16 positions in the block, but you also need to communicate a 17th outcome, the no error condition. The solution here is actually pretty simple, just forget about that 0th bit entirely. So when we do our four parity checks and we see that they're all even, it unambiguously means that there is no error. What that means is rather than working with a 16-bit block, we work with a 15-bit block, where 11 of the bits are free to carry a message and 4 of them are there for redundancy. And with that, we now have what people in the business would refer to as a 15-11 Hamming code.

(15, 11) Hamming code

That said, it's nice to have a block size that's a clean power of 2, and there's a clever way we can keep that 0th bit around and get it to do a little extra work for us. If we use it as a parity bit across the whole block, it lets us actually detect, even though we can't correct, 2-bit errors. Here's how it works. After setting those four special error-correcting bits, we set that 0th one so that the parity of the full block is even, just like a normal parity check. Now, if there's a single bit error, then the parity of the full block toggles to be odd, but we would catch that anyway thanks to the four error-correcting checks. However, if there's two errors, then the overall parity is going to toggle back to being even, but the receiver would still see that there's been at least some error because of what's going on with those four parity checks. So if they notice an even parity overall, but something non-zero happening with the other checks, it tells them there were at least two errors. Isn't that clever? Even though we can't correct those 2-bit errors, just by putting that one little bothersome 0th bit back to work, it lets us detect them. This is pretty standard, it's known as an extended Hamming code.

Extended Hamming Code

Technically speaking, you now have a full description of what a Hamming code does, at least for the example of a 16-bit block. But I think you'll find it more satisfying to check your understanding and solidify everything up to this point by doing one full example from start to finish yourself. I'll step through it with you though so you can check yourself. To set up a message, whether that's a literal message you're translating over space or some data you want to store over time, the first step is to divide it up into 11-bit chunks. Each chunk is going to get packaged into an error-resistant 16-bit block. So let's take this one as an example and actually work it out. Go ahead, actually do it! Let's pause and try putting together this block. Okay, you ready? Remember, position 0 along with the other powers of 2 are reserved for error correction duty, so you start by placing the message bits in all of the remaining spots, in order. You need this group to have an even parity, which it already does, so you should have set that parity bit in position 1 to be a 0. The next group starts off with an odd parity, so you should have set its parity bit to be 1. The group after that starts with an odd parity, so again you should have set its parity bit to 1. And the final group also has an odd parity, meaning we set that bit in position 8 to be a 1. And then as the final step, the full block now has an even parity, meaning you can set that bit number 0, the overarching parity bit, to be 0. So as this block is sent off, the parity of the four special subsets and the block as a whole will all be even, or 0. As the second part of the exercise, let's have you play the role of the receiver. Of course, that would mean you don't already know what this message is, maybe some of you memorized it, but let's assume that you haven't. What I'm going to do is change either 0, 1, or 2 of the bits in that block, and then ask you to figure out what it is that I did. So again, pause and try working it out. Okay, so you as the receiver now check the first parity group and you can see that it's even, so any error that exists would have to be in an even column. The next check gives us an odd number, telling us both that there's at least one error, and narrowing us down into this specific column. The third check is even, chopping down the possibilities even further. And the last parity check is odd, telling us there's an error somewhere in the bottom, which by now we can see must be in position number 10. What's more, the parity of the whole block is odd, giving us confidence that there was one flip and not two. If it's three or more, all bets are off. After correcting that bit number 10, pulling out the 11 bits that were not used for correction gives us the relevant segment of the original message, which if you rewind and compare is indeed exactly what we started the example with. And now that you know how to do all this by hand, I'd like to show you how you can carry out the core part of all of this logic with a single line of Python code. You see, what I haven't told you yet is just how elegant this algorithm really is, how simple it is to get a machine to point to the position of an error, how to systematically scale it, and how we can frame all of this as one single operation rather than multiple separate parity checks. To see what I mean, come join me in part 2.

Другие видео автора — 3Blue1Brown

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Лучшие методички за неделю — каждый понедельник