I built the most expensive CPU ever! (Every instruction is a prompt)
21:51

Yannic Kilcher 08.11.2023 24,061 views 1,127 likes

Video description
#gpt4 #promptengineering #compiler

I have built the most expensive CPU ever by compiling arbitrary C code to LLVM intermediate representation (LLVM-IR), then sending every single instruction to OpenAI's GPT API. A true marvel of engineering.

Check out course: https://wandb.me/yannic-course
Repo: https://github.com/yk/llmvm

OUTLINE:
0:00 - Intro
0:55 - Sponsor Time
2:20 - FizzBuzz
4:05 - Computers 101
6:30 - Building a CPU on top of a large language model
7:40 - Compilers & LLVM
10:05 - From parsers to virtual machines
12:10 - GPTVM - A VM powered by GPT
15:55 - Snek!
18:40 - Going beyond AGI - Introducing Chad GPT VM

Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Contents (10 segments)

Intro

Andrej posted this tweet a while back. He says that, with the many puzzle pieces dropping recently, we should consider LLMs not as a chatbot but as the kernel process of a new operating system. You know, it can orchestrate all of these things, and it can even act as all of these things: it can act as an operating system, it can orchestrate databases, and so on. But I want to say: Andrej, an operating system is nothing. Your creativity limits you, you're thinking inside a box. What we're doing here goes beyond: we implement an entire CPU in GPT. So I can run any program that I want. I have a compiler, and through that compiler I can then execute it against any language model in the background. Let me show you how this works. I have a deal for you.

Sponsor Time

Listen to me talk for 30 seconds about today's sponsor, Weights & Biases, and by doing that I'm happy, you're happy, Weights & Biases is happy, everyone's happy. If that sounds good to you, then here is the deal. The course "Training and Fine-tuning Large Language Models" has been created by Weights & Biases and is completely free. This is a free course about, well, training and fine-tuning large language models. The lineup of instructors they have, as you can see right here, is sweet. These are seriously good people, and you might recognize one or the other from the current machine learning landscape. The course is done together with Mosaic and contains 37 lessons.

Now, I already said it's about training and fine-tuning large language models, but the column here in the middle is what I'm personally most excited about: curate the dataset and establish an evaluation approach. From practice (and I happen to be more in industry, more in practice now) I can tell you that, yeah, fine-tuning is flashy, prompting is flashy, but to even be able to measure the quality of something like a language model, especially one you've trained yourself, is among the most crucial and most in-demand skills there are. Thanks to Weights & Biases for...

FizzBuzz

...sponsoring this video, and let's go on. Hello everyone, look at this program right here. I'm going to run it. It's called FizzBuzz, and it's often used in programming interviews. The idea is that you iterate over the numbers from 0 to 30 in this case, and every time the number is divisible by three you output the word Fizz, every time it's divisible by five you output the word Buzz, and every time it's divisible by both you output the word FizzBuzz. Otherwise you just output the number.

You can already see there is an output: we have FizzBuzz. Now, 1 is the correct next output, because that's not divisible by three or five, and we hope the next output is going to be 2. Let's wait, the excitement is palpable. Two, there it is. And the only thing I wish for now is a little bit of Fizz for the next output. Fizz, there we go, that's it.

So you may think: what's the matter? It's like the simplest C program ever, and I'm making a big deal about it. So what is it? Let me show you. Let me clear this and turn on debug output. All right, a whole different picture appears. Look at that: all of this is happening, and until you actually see a print output there's going to be a long time. So what you're seeing right here (FizzBuzz, there we go) are machine instructions getting executed against a processor. Usually the way a computer works is that...
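For reference, the FizzBuzz logic described in this segment can be sketched in a few lines. The program in the video is written in C; this is a Python rendering of the same rules, and the function names and the choice of an exclusive upper bound of 30 are assumptions made for illustration.

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz output for a single number."""
    if n % 3 == 0 and n % 5 == 0:
        return "FizzBuzz"  # divisible by both (this includes 0)
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def run(limit: int = 30) -> list:
    """Collect the outputs for 0 .. limit-1 (the exclusive bound is an assumption)."""
    return [fizzbuzz(i) for i in range(limit)]


if __name__ == "__main__":
    for line in run():
        print(line)
```

The first four outputs are FizzBuzz, 1, 2, Fizz, which matches the order of outputs seen in the video.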

Computers 101

...you have what's called code, and code is a set of instructions. It's loaded into memory, so there are instructions, and there is a program counter that just kind of runs through these instructions, and every instruction does something. So you have the central thing, the CPU, and the CPU takes a piece of code and is supposed to do computations. It has these things called registers (let's call them buckets), and it has memory, and it has input/output and so on, all of this available.

So the CPU gets an instruction, for example: move some number from register 4 to register 8. What it will do is take it out of register 4 and put it into register 8. It can also do something like a compare, which is a really interesting instruction: compare (CMP) register 4 to the value 1, and store the result in register 2. So it'll go to register 4, compare it with the value 1, and store either a 0 or a 1 in register 2. And then the next instruction could be something like a branch, or conditional branch, depending on register 2: if register 2 is true, then go to instruction number 4, otherwise 8. So we look at register 2, and if that is true, meaning register 4 was equal to 1 before, we jump to one instruction; otherwise we jump to another instruction.

This is how, at the lowest level, your computer implements things like if conditions. You can do math here, you can sum two things together, you can jump around, do control flow, loops, and so on. All of this is implemented at the lowest level by instructions to the CPU. Now, okay, what does this have to do with anything, and why is this video called "I built the most expensive CPU ever"? You might already guess what's coming right...
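The move, compare, and conditional-branch instructions described in this segment can be sketched as a tiny interpreter loop. This is a toy model for illustration only: the tuple encoding of instructions and the names `run` and `PROGRAM` are assumptions, not anything shown in the video.

```python
def run(program, registers):
    """Execute a toy instruction list, using a program counter (pc)."""
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "mov":      # copy the value of one register into another
            src, dst = args
            registers[dst] = registers[src]
            pc += 1
        elif op == "cmp":    # store 1 in dst if register == value, else 0
            reg, value, dst = args
            registers[dst] = int(registers[reg] == value)
            pc += 1
        elif op == "br":     # conditional branch: pick the next instruction
            cond, if_true, if_false = args
            pc = if_true if registers[cond] else if_false
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


# "if register 4 equals 1, copy it into register 8"
PROGRAM = [
    ("cmp", 4, 1, 2),  # 0: compare r4 with 1, result goes to r2
    ("br", 2, 2, 3),   # 1: if r2 is true go to 2, otherwise go to 3
    ("mov", 4, 8),     # 2: move r4 into r8
    ("halt",),         # 3: stop
]

if __name__ == "__main__":
    print(run(PROGRAM, {2: 0, 4: 1, 8: 0}))
```

When register 4 holds 1, the branch falls into the move; with any other value the move is skipped. That is the whole mechanism behind an `if` statement at this level.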

Building a CPU on top of a large language model

...now. In fact, all of the instructions that you see right here are CPU instructions, and each one of them is getting executed not by the actual CPU in your system but by GPT simulating a CPU. So I can run any program that I want: I have a compiler, and through that compiler I can then execute it against any language model in the background. Let me show you how this works. And this really is the most expensive CPU ever: it takes hundreds of instructions just to get to the first FizzBuzz, and every time an instruction is executed I send a whole bunch of stuff to GPT and I get an answer back, and that's one machine instruction. We've come full circle: we've built CPUs, we have trained huge language models on them, and now these language models can act as CPUs. So fundamentally, when you write computer code, you at some point need to make it into machine instructions...

Compilers & LLVM

...which means you need to compile it in some way. Now granted, languages like Python and JavaScript do a different thing, but if you write a C program like this, something has to take it and compile it, and that's usually a compiler. What the compiler does, the part you may know, is turn it into assembly code. GCC can do this for you, and if you look at the generated assembly code you can see much more of what's going on. This here is FizzBuzz compiled for the x86 architecture, and you can see all the little operations right here: we first push some stuff, we add some stuff, we move some stuff, and eventually we'll compare some stuff and jump around depending on the results of those comparisons. We multiply things here, which certainly has something to do with the fact that we're computing remainders of multiplications, or rather of divisions, and so on. All of this you might be familiar with.

Now, if you're using something like macOS, like I'm using right here, the actual compiler you use isn't necessarily GCC but a compiler called Clang. I'm not exactly sure how it does it, but it has an LLVM toolchain in the background. That essentially means LLVM is a toolchain for many compilers: not only C but C++, and any kind of language can implement an LLVM pre-compiler, let's say, to turn its programs into LLVM instructions, and then LLVM can take care of really outputting the assembly or the machine code for a given architecture. So LLVM is this unified thing, and if we look at that, it's actually much more readable. Here is the LLVM output, the LLVM intermediate representation, for FizzBuzz, and this looks like something one can actually understand. Assembly still has a lot of high registers, low registers, and blah blah; I don't want to deal with any of that. But this here seems doable.

So what do we need to do in order to achieve our goal of building a GPT CPU? First of all, we need to parse this program, and I've done that. So if...

From parsers to virtual machines

...you look in the parsing file, I've written a parser to parse all of the relevant instructions. Right here you can see signed remainder and addition; you can see store, branch to labels, and so on. So I've written a parser that takes LLVM intermediate representation and builds essentially an abstract representation of it that I can then use to continue.

The next thing I did is I built a reference implementation of a CPU. That means I have built something that, in just pure Python code, emulates the CPU. You can see right here I have constants, which are usually defined at the beginning of the program, I have registers, and I have memory. And I'm kind of cheating a little bit right here: usually things are way more complicated than what I'm doing right now, but whenever possible I'm just storing values in the registers and not pointers to them, and then I have to trick around a bit when it comes to arrays and things like this. In any case, I'm emulating a CPU right here with Python, and the idea is just that I need to make sure that my whole parsing setup works. You can see right here I am going through the operations and implementing them with Python: addition, easy; multiplication, that's that; array lookup, certainly. All of these types of things are just implemented in terms of pure Python code.

We can actually run this right now. If I go back here and use the reference implementation, you can see FizzBuzz in all its glory, and I hope that's correct; it probably is. So this gives me confidence that my whole parsing and so on is correct, and it obviously runs way faster, because I'm just emulating a CPU in Python. However, what I did next will shock you. So next I...
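The parse-then-emulate idea from this segment can be sketched as follows. This is not the parser from the repo: it is a minimal illustration that handles only three LLVM-IR binary operations on `i32` values (`add`, `mul`, `srem`), and the names `BinOp`, `parse_line`, and `execute` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class BinOp:
    dest: str  # destination register, e.g. "%8"
    op: str    # "add", "mul" or "srem"
    lhs: str   # register name or integer literal
    rhs: str


# Matches lines such as "%8 = srem i32 %7, 3" or "%3 = add nsw i32 %1, %2".
_BINOP = re.compile(
    r"^\s*(%\w+) = (add|mul|srem)(?: nsw| nuw)* i32 (%?-?\w+), (%?-?\w+)\s*$"
)


def parse_line(line: str) -> BinOp:
    match = _BINOP.match(line)
    if match is None:
        raise ValueError(f"unsupported instruction: {line!r}")
    return BinOp(*match.groups())


def srem(a: int, b: int) -> int:
    """Signed remainder: the result takes the sign of the dividend, as in C."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def value_of(operand: str, registers: dict) -> int:
    """Resolve an operand: look up a register or read an integer literal."""
    return registers[operand] if operand.startswith("%") else int(operand)


def execute(instr: BinOp, registers: dict) -> None:
    ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b, "srem": srem}
    lhs = value_of(instr.lhs, registers)
    rhs = value_of(instr.rhs, registers)
    registers[instr.dest] = ops[instr.op](lhs, rhs)


if __name__ == "__main__":
    registers = {"%7": 10}
    execute(parse_line("  %8 = srem i32 %7, 3"), registers)
    print(registers)
```

Keeping the parsed form (`BinOp`) separate from the code that executes it is what makes it possible to swap the pure-Python executor for a different backend later.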

GPTVM - A VM powered by GPT

...implemented a VM based on GPT. In this VM I'm still managing the registers myself, so the registers are just going to be a dict that I manage. However, now every single operation in here is a prompt. For example, if I need to get something from the registers, I'll send the prompt "look at the registers", then I'll send the full registers, and I'll say "now get me the value of" a certain index in the registers. It's up to GPT to look up the correct thing in that dict and give it back to me. I have addition prompts, I have remainder prompts, I have comparison prompts, as you can see right here, even system prompts: "You're a very keen observer."

All right, let's dive into this. Let's look at this block right here, the block with label number 6. The first thing it does is load from pointer number 2 in the registers into number 7, so into register 7. These are abstract registers for now, but bear with me. We see right here in the debug log: this is from the program code, what we need to do, and this is my parsing of this line. Now I send this to GPT. I say: look at these registers right here. So I have registers, number 1 is zero, number 2 is zero, and so on. These are registers that I manage, but I just send them to GPT as a string and tell it: get me the value of number 2. This corresponds to this line right here in the code, and the system correctly says that's zero. Okay, so the next thing is we need to assign this value, zero, to the register. Assigning in this VM is simply me putting it into the registers; however, in a later iteration, which I'll show you later, even that is done by GPT.

The next thing is we want to compute the remainder of register number 7, where we just stored something, with the number 3. So the first thing we need to do is read from the registers. I'm again going to send the registers to GPT and say: get me the value of register number 7. It says that's zero. That is correct, that is indeed zero, as you can see right here. Then I let it compute the remainder: get the remainder value of 0 / 3. Again it says 0; that's correct. Assign that to number 8 (that's up here in the code), and then we'll compare. The next instruction is a comparison, an integer comparison between number 8 and zero. Again we look at the registers, we get the value of number 8, it says correctly that's zero, and we execute the comparison operation: are 0 and 0 equal? "You're a very keen observer", and we get the answer "yes". So let's assign that to register number 9. Now, we can't control what GPT says: it could be like "yes", "that's true", or anything like this. Whatever it says, we'll just store it into the register.

Further down we're going to do a conditional jump: look at the registers, get me the value of number 9. I have a bit of a truthiness conversion right here to be a bit robust, and I do a branching jump. But you can see that for a given instruction there are going to be one or even multiple GPT instructions that go along with it, and thereby this is truly the world's most expensive, and probably most faulty, CPU. So let me show you a...
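The "every operation is a prompt" design from this segment can be sketched without calling any real API. In the sketch below the model is a plain callable (`ask`) that takes a prompt and returns a reply, and a scripted stand-in plays the role of GPT. The prompt wording, the function names, and the truthiness rule are assumptions for illustration; the repo's actual prompts and conversion logic may differ.

```python
from typing import Callable, Dict, List, Tuple

Ask = Callable[[str], str]  # hypothetical interface: prompt in, model reply out


def get_register(ask: Ask, registers: Dict[str, str], name: str) -> str:
    """Ask the model to look a value up in the registers we manage ourselves."""
    return ask(f"Look at these registers: {registers}\nGet me the value of {name}.")


def remainder(ask: Ask, a: str, b: int) -> str:
    """Ask the model to do the arithmetic."""
    return ask(f"Get the remainder value of {a} / {b}.")


def exec_srem(ask: Ask, registers: Dict[str, str], dest: str, src: str, divisor: int) -> None:
    """One IR instruction becomes two prompts: a register read, then the math."""
    value = get_register(ask, registers, src)
    registers[dest] = remainder(ask, value, divisor)  # stored verbatim, right or wrong


def is_truthy(reply: str) -> bool:
    """A lenient truthiness conversion for free-text answers (assumed rule)."""
    text = reply.strip().lower()
    return text.startswith(("yes", "true")) or text == "1"


def scripted_model(replies: List[str]) -> Tuple[Ask, List[str]]:
    """Stand-in for the language model: returns canned replies and logs prompts."""
    log: List[str] = []
    pending = list(replies)

    def ask(prompt: str) -> str:
        log.append(prompt)
        return pending.pop(0)

    return ask, log


if __name__ == "__main__":
    ask, log = scripted_model(["0", "0"])
    registers = {"%7": "0"}
    exec_srem(ask, registers, "%8", "%7", 3)
    print(registers, "prompts sent:", len(log))
```

Because replies are stored as raw strings, anything the model adds around the number ends up in the registers and gets sent back in later prompts, which is the failure mode shown later in the video.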

Snek!

...bit of a different program right here. It's called Snake, and if I execute it on this, you can see it's a very basic form of Snake. I can move the little "o" around until it eats the star. You can see it works, and if I eat it, I actually win. It's not a complicated program, but it's quite a bit more complicated than the FizzBuzz program, as you can see right here. But I didn't just implement FizzBuzz, right? We have a fully fledged compiler from any C program to running it on GPT, and that's exciting. So all we need to do, modulo some ops I haven't implemented, is run this through LLVM, get the intermediate representation, and then run it through my parser, and then we can fully play Snake on GPT. How awesome is that? Who before has played Snake on GPT? I know, the next iteration is Doom.

So if I run this now, this is going to go and go. You can see how the registers slowly get filled up with values coming back from GPT, and we're doing comparisons, we're doing all kinds of math, we're doing additions and remainders. And you can see it kind of screwing up: "the answer is one" is here in the registers instead of just the number 1, and that will flow into the next prompts and so on. So this is an extremely faulty system, it's extremely inefficient, it's extremely wasteful. We're using the world's most powerful language models, but we're doing it in service of playing Snake, and I think if anything that's a cause that's worth it.

In fact, it's going to take about 600 instructions until I can do the first input in this program, conditioned on it never failing, because it will just randomly decide that now the number 2 is actually greater than the number 30. And oh look, here it made an array; that's certainly good, I'm pretty sure that's good. So you'll see it has a few faults and sharp corners, but in general it kind of works. And not only that, I've gone further than this. Not only have I made a basic VM that just executes the instructions, I have also made one that even manages the...

Going beyond AGI - Introducing Chad GPT VM

...registers itself. So instead of the registers being a dict, the registers are now just a string that I initialize with an empty dict, and it has to do all of the stuff by itself. As an extension of that, I've implemented this beauty right here. I call it Chad GPT VM, and the prompt is kind of always the same: "Look bro, I know you love to talk, but I really just need the answer, please don't explain your... you have to..." I'm really not good at prompting. But get-from-registers is implemented like this: "Bro, look at these rad registers. Bro, for real, just tell me the value of register" something. Addition is implemented as "Yo, what's the sum of this and that?" "Bro, for real, do you know how remainders and stuff work? I really need the remainder value of that." "My man, you mad good at multiplication." And you know, it's a beauty to watch this work, so let's do that. I literally can't wait anymore.

All right, this is now considerably slower, because not only am I asking it for any particular register value, but I always send the full state of the registers and I get it back. Now, nothing constrains it to make a dict and send me back a dict; it could send me back whatever it wants. Sometimes it actually does that: sometimes it just sends back something else, or puts some text around it, or gives me some friendly comments. So you see: "My main man, I need you to store the value 0 in register 9, and these are the current registers", and it gives me the result. It again has 9 here, but also 9 here, so it just kind of doubled up the registers, and that's certainly not good. I guess register 2 here kind of holds another dict with the registers, which is closed here. So, you know, this CPU might need some work.

Watch out, TSMC, we're coming for you, you will not be able to compete. Nvidia stock goes (this is not financial advice) Nvidia stock boom, TSMC stock boom, Chad GPT stock through the roof. This just shows: if you set yourself a goal, you can achieve anything you want. With that being said, the code here is on GitHub. Do with it whatever you want. It's a giant waste of money and time, but you can play Snake, and what else is there? That was it for me. Thank you very much, I'll see you around. Bye-bye.
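The Chad GPT VM segment describes the model returning the whole register state as free text, sometimes wrapped in chatter and sometimes with a dict nested inside a register. A defensive way to read such a reply back could look like the sketch below. This is a hypothetical mitigation, not something shown in the video or taken from the repo; `extract_registers` and `has_nested_state` are illustrative names.

```python
import ast


def extract_registers(reply: str, fallback: dict) -> dict:
    """Pull a dict literal out of a chatty reply; keep the old state on failure."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return fallback  # no dict-like span at all
    try:
        parsed = ast.literal_eval(reply[start:end + 1])
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
        return fallback  # the span was not a valid Python literal
    return parsed if isinstance(parsed, dict) else fallback


def has_nested_state(registers: dict) -> bool:
    """Detect the 'doubled up' fault where a register holds another dict."""
    return any(isinstance(value, dict) for value in registers.values())


if __name__ == "__main__":
    reply = "Sure thing bro! {'%2': 0, '%9': 0} Hope that helps."
    print(extract_registers(reply, fallback={}))
```

`ast.literal_eval` only evaluates literals, so a reply cannot execute code this way. Falling back to the previous state hides the fault rather than fixing it, so a real system would more likely retry the prompt.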
