Finding Hardware Bugs - Computerphile

18:09

Finding Hardware Bugs - Computerphile

Computerphile 29.04.2026 35 777 просмотров 1 256 лайков

Machine-readable: Markdown · JSON API · Site index

Смотреть на YouTube

Поделиться Telegram VK Бот

Транскрипт Скачать .md

Анализ с AI

Описание видео

When you're setting your hardware design out using automated tools is essential, but what if the tools themselves have bugs in them? John P Wickerson is based at Imperial College London. The paper that John's team wrote about the work: https://johnwickerson.github.io/papers/fuzzing_pnr.pdf An accessible blogpost John wrote about the work: https://johnwickerson.wordpress.com/2026/01/06/finding-bugs-in-fpga-place-and-route-tools/ Computerphile is supported by Jane Street. Learn more about them (and exciting career opportunities) at: https://jane-st.co/computerphile This video was filmed and edited by Sean Riley. Computerphile is a sister project to Brady Haran's Numberphile. More at https://www.bradyharanblog.com

Оглавление (4 сегментов)

Segment 1 (00:00 - 05:00)

What uh my research team and I have been most interested in the last few years is trying to make EDA tools more reliable. So by EDA tools I mean these are the tools that uh hardware engineers use to make hardware. The basic uh kind of flow would be your hardware engineer would come up with a design for what they want uh the functionality of your hardware uh to be. uh and they'll agonize over making sure it's all correct and um it's efficient and all these good things and then they will feed that human readable design into these EDA tools. So EDA is electronic design automation and they uh they have fancy names like elaborators and synthesis tools and place and root is one of them. So, it's kind of a sequence of tools and eventually what comes out at the far end um is something that could literally go on uh a piece of hardware like we have here. So, this is uh so underneath this big heat sink is an FPGA. So, it's a computer chip that you can implement whatever design you like. FPGAs sit in a weird uh in between state between you've on the one hand you've got sort of traditional computer chips where their functionality is completely fixed at the time that they're manufactured. Mhm. — Uh so they might be called AS6. So application specific integrated circuit. Um on the other hand, you might have um a processor which is a sort of general purpose processor CPU or something and that can do whatever you like because you just need to feed it a series of instructions. — The right software. Exactly. And it will do whatever you like. The FPGAs are sort of in between that because they are it is it's just a computer chip but it's set up in such a way that you can change what those uh individual gates on the chip do um by feeding it a new uh implementation if you like. Um so uh yeah so it's sort of in between — and that implementation c could you use the same FPGA for two different things or does it get set at some point? Is there like a point where you go I'm going to make this do that and now it is that — so the FPGA stands for field programmable gate array. So array of these logic gates field as in out in the field it can be reprogrammed. — Okay. So, and it only takes, you know, a fraction of a second to reprogram it, and you can reprogram it any number of times you like. The goal of these EDA tools is um yeah hardware designer has what they want to build pass it through all of these tools and it will end up as at the end of the day it's a sequence of zeros and ones that you feed in through one of these ports on this board that will end up configuring that chip — okay — to do what you want it to do. — Okay. And what we've been interested in is can we trust those tools to do the right thing — because as I said you know the process is that this hardware design is hard um and the designer will spend ages agonizing over uh you know getting the maximum possible performance and uh everything is working correctly in all of the possible edge cases and then they have to offload load that design into this sequence of tools. — Mhm. — At which point, you know, it's a it's like hopefully these tools are preserving what I designed so that what actually ends up on there is correct. — When you say trusting, do you mean trusting in terms of kind of nefarious activities or do you mean just trusting in terms of hopefully this will do what I want it to do? — I mean the latter. I mean that if the EDA tool has a bug in it that could cause bugs in your tool. It's like proliferation of bugs in a way. Um if there's a bug in the tool itself rather than in anything you've done that's a nightmare to debug. What we set out to do was um try to find bugs in these EDA tools. And the the tool that we were particularly focusing on in the project that I to talk about today um was the place and route tools. So this is the stage in the hardware design where you have maybe I'll start I'll try drawing something. Um so you have uh your design for your hardware. So perhaps just a very simple version. Maybe I have some sort of gate like that. And maybe I have

Segment 2 (05:00 - 10:00)

a inverter and maybe one more gate over here. And maybe my design says right what I want the computation to do is uh connect that to that. Maybe this is going to come around like that. And these two are going to be fed by that. Maybe that's my hardware design. Okay. Now I give that to the place and route tool. And what that tool will do is it will say uh right. So this gate here I'm going to put that uh maybe uh here on the uh on the FPGA. Uh this orgate maybe I'm going to locate that one there. And this inverter I think there's a good space for that just there. So that's the placing. And then the routting would be right. I've got to make sure that uh this component on the FPGA I've activated all of the routting machinery so that links up to that and this bit better link rounds to that in the correct way and I've got to mirror that connection like that that's got to connect to an input. So the result of the place and route might look something like that and we set out to try and find out does the tool always do this transformation correctly. We used a technique called fuzzing. So uh it's a way of testing where you are basically generating a whole load of random inputs. So what we did was we came up with a program that would produce random hardware designs something like this just like what I drew here where I just put some random gates in. I connected them in a random way but magnify that up. Pretty big quite complicated. There are a few subtleties to doing that because you've got to make sure that there are certain restrictions on um you don't want to go having uh certain kinds of loops would cause problems. Probably what I've drew here for instance is I should that doesn't look very good. It looks a bit too many cycles there. You got to watch out for that kind of thing. But yeah, what we then have is a whole load of basically meaningless hardware designs but superficially valid. feed them into the tool, see what comes out, and compare what comes out with what went in using what's called an equivalence checker. Um, which itself is another EDA tool that people use. And if the equivalence checker says actually that's what's come in, what's come out is not the same as what came in. Um, then we investigate and then we say, "All right, I think we might have found a bug. " — Okay. on that investigation you have to then get into the you know the source code of whatever's doing the placing and the routting. Wow. Yeah. So, this is one of the things about the hardware design uh industry is it compared to um more software uh side mostly it's all commercial. It's all closed source. There's no chance unless we actually worked for these big chip companies, we're not looking at their source code. So the best we can realistically do once we have a case where we've got some hardware design here that we can see is producing something that's not equivalent. Um best we can really do is just try and understand what is like what's the crux of the problem here. Um — it's like can we minimize that design? Chop away all of the bits that are not relevant to triggering the bug and end up with a really small design. Maybe it's just got a handful of gates that still triggers the bug — because that is that's our best attempt at understanding the the root of the bug and we can file that as a bug report to the vendors. So this is a slightly simplified version of uh a bug that uh we found. Um so what the design had in it was so the main component was it's called a lookup table um which is basically uh a table has a whole bunch of rows in it uh and each row has some data and the way that you can use this lookup table is you have some uh input wires coming in that say which row of the table I would like uh to come out on this output wire. Um I think they in this particular case there were maybe about four input wires and again in this randomly generated uh design what we had was we had this input was fixed at uh a zero. That input was fixed also at zero and that one was a zero as well. And here we had a zero that was passing through an inverter. So becoming — like a not gate. — A not gate. Yes. So we had 0 0 1 — which meant that we were requesting row one of this table to come out. — Which is this row here assuming that the

Segment 3 (10:00 - 15:00)

first row is row zero. So that was producing 0 the output y. Um so we gave a design basically like this with a couple of extra components here and there but these were the main ones. We gave this design to the place and root tool. And what tool did was during placing and routting it had an idea at least this is how we told it among ourselves. Um it had an idea that it doesn't need this inverter because it looked at the contents of this lookup table and it said well look row one which is the one you're asking for that's actually the same contents as row zero. I wrote these zeros and ones quite carefully. — Ah, okay. So, doesn't matter if it's zero or one. It's going to give the same result in that instance. — Exactly. So, how about I just get rid of this inverter and just put a the zero here. That will send out this row instead of this row, but what comes out will be exactly the same. And I'm using fewer gates. So, you know, better energy efficiency, smaller circuits. This is all good, right? So, the place root tool thought it was being very clever with this. Unfortunately, it had neglected that this was a slightly special lookup table which has an additional couple of ports on the top of my diagram. Um, this port uh here, if you set that high, then you're saying I would like to reconfigure the contents of this table. So um you set the way you do it. You set that wire to go high and then this wire here. You feed in a sequence like 0 1 1 0 etc. — And those sequence of zeros and ones will replace whatever's in this — all these entries in the table. — It's dynamic the lookup table. — Exactly. That's exactly what you call it. It's a dynamic lookup table — or dynamically reconfigured lookup table is the fancy name for it. this uh placement group tool had failed to notice um is that this uh wire here was constantly set to high which meant that whatever was coming in on this wire was constantly repopulating this table with well whatever was whatever this wire was connected to on some external input which means that although the initial contents of the table was such that inverter could be removed once new data has come in you can't make that same assumption so it was completely wrong to remove that inverter and that was the crux of the buzz. I think there is plenty of scope to continue studying these uh EDA tools and that there are lots of EDA tools and I think there's plenty of scope for developing new techniques to find more bugs and more tools. You certainly could carry on doing this uh this kind of work for a while. But the other thing that we've been looking at kind of concurrently with this is saying well you know is it possible to build different EDA tools that do not have bugs in which on the face of it maybe seems a bit ambitious. Um but there is um uh there are these techniques uh called formal verification or foundational verification and it it's basically about saying can we make a mathematical proof that this piece of software and ultimately the EDA tool is just a piece of software. um can we make a mathematical proof that software does what it's supposed to do which is you know always preserve your design in this case um and you prove it just like a mathematical theorem um you know the story I the way I um talk about this often is you imagine you know a mathematician who has um come up with a new proof of I don't theat theorem or some famous theorem and they present their uh their proof and what it consists of is um oh well I tried out the theorem on a whole loads and loads of numbers and it worked uh every time uh it it the two sides lined up the other mathematicians would laugh at them that doesn't count. It has to be a general watertight argument that works for all possible numbers. Um well you know the state of building software is basically we know we write the software we run loads and loads of tests through it see that it seems to be giving the right result and call it a day. Um what the world of formal verification is saying is well can we take those ideas from math and actually make a theorem about this piece of software that says it really will always do the right thing. So this is what we've been trying to do uh for the EDA tools. So we've got um so one project from a couple of years ago was building a verified synthesis tool. Um we're currently working on building a verified equivalence checker. That was the EDA tool that would check your

Segment 4 (15:00 - 18:00)

design before and after some transformation. Um so the and if that all works out then all this bug finding uh techniques they should become a bit more redundant — with maths let's be honest you know these theorems or things they're trying to prove there could be thousands of bits of computation in a program how can you that all interrelate and connect together surely it's an impossible task compared to say mathematics and that's you know me bit not being down on mathematics I just think there's you know there's just so much more to it. — Yes. No that's very completely correct. Um so Fat's last theorem for instance uh my example just then um exceedingly concise statement but incredibly deep and you know takes enormous expertise to be able to prove that kind of thing. And you're right that any substantial piece of software is just inherently well, you know, it's big. And so if you imagine turning that into a mathematical theorem, you're certainly not going to get anything that fits on a mathematician's blackboard. But it tends not to be anywhere near as deep. It's big because that whole theorem has got to capture the source code of the program, the how the programming language in which you've written the program works. the com something about the computer that you're running that program on, you've got to capture all of this. Imagine writing that out as a theorem is is ridiculous. Um, but yeah, the hope is that then when it comes to actually proving that theorem, you know, nothing's easy, but it's not the kind of uh it's not like proving the last theorem. And indeed, um, because so and the way you can get around the fact that it's so cumbersome is you use a computer. Maybe you saw this coming, but um — I was hoping that was where we're going — indeed. So we you have nowadays these um they're basically programming languages for proof. — Yeah, I think we've used some of them on the channel Lean and things like this before. — Indeed. So you've got lean, you've got rock, you've got Isabel, and a few others. So I think of these as programming languages for doing mathematical proofs. Um, and they're programming languages in the sense that it means that the computer can understand the proof and the computer so it can it means the computer can help you write the proof at least fill in the boring details. You know, if you've got some massive case split you need to do, the computer can help you generate all of those cases. Um, and it can also check the proof for you. Um, so yeah, that that's I think ultimately where we're heading. The fuzzer is going to generate some program. So what's in this program is going to be randomly determined by the fuzzer. However, the program is going to be an entirely deterministic program. We can give this program to GCC. We can give it to Clang.

Другие видео автора — Computerphile

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Лучшие методички за неделю — каждый понедельник