For example, how do you know that every time you're rolling over to the eights place, that disk 3 is necessarily going to be freed up to move? At the same time, the solution just immediately raises these questions like, where does this come from, why does this work, and is there a better way of doing this than having to do 2 to the n minus 1 steps? It turns out, not only does this solve Towers of Hanoi, but it does it in the most efficient way possible. Understanding why this works and how it works and what the heck is going on comes down to a certain perspective on the puzzle, what CS folk might call a recursive perspective. Disk 3 is thinking, okay, 2, 1, and 0, you have to get off of me. I can't really function under this much weight and pressure. And so just from disk 3's perspective, if you want to figure out how is disk 3 going to get over here, somehow, I don't care how, disk 2, 1, and 0 have to get to spindle B. That's the only way it can move. If any of these disks are on top of 3, it can't move. If any of them are in spindle C, it can't move there. So somehow we have to get 2, 1, and 0 off. Having done that, then we can move disk 3 over there. And then disk 3 says, I'm set. You never need to move me again. Everyone else just figure out how to get here. And in a sense, you now have a smaller version of the same problem. Now you've got disk 0, 1, and 2 sitting on spindle B, you've got to get them to C. So the idea is that if I just focus on one disk and I think about what am I going to have to do to get this disk to work, I can turn my bigger problem into something slightly smaller. And then how do I solve that? Well, it's exactly the same thing. Disk 2 is going to say, disk 1, disk 0, it's not you, it's me. I just need some space. Get off. They need to move somewhere. Then disk 2 can move to where it needs to go. Then disk 1 and 0 can do this. But the interesting point is that every single disk pretty much has the exact same strategy. They all say, everybody above me, get off. Then I'm going to move, OK, everyone pile back on. When you know that insight, you can code up something that will solve Towers of Hanoi, like five or six lines of code, which probably has the highest ratio of intellectual investment to lines of code ever. And if you think about it for a bit, it becomes clear that this has to be the most efficient solution. At every step, you're only doing what's forced upon you. You have to get disk 0 through 2 off before you can move disk 3. And you have to then 0 through 2 back onto it. There's just not any room for inefficiency from this perspective. So why does counting in binary capture this algorithm? Well, what's going on here is that this pattern of solving a subproblem, moving a big disk, then solving a subproblem again, is perfectly paralleled by the pattern of binary counting. Count up some amount, roll over, count up to that same amount again. And this Towers of Hanoi algorithm and binary counting are both self-similar processes, in the sense that if you zoom out and count up to a larger power of 2, or solve Towers of Hanoi with more disks, they both still have that same structure. Subproblem, do a thing, subproblem.