We are in a Race To Understand AI | Eric Ho, Goodfire


Segment 1 (00:00 - 05:00)

There's something not quite right about the way that we're building AI models. We have pretty much every single CEO of all the big AI labs talking about AGI and these superintelligent systems. We just don't understand, we can't control, we can't steer, we can't shape and mold as our own. And it just seems deeply irresponsible to me to deploy all of these systems in mission-critical contexts without a deeper understanding of how these models work. — How urgent is the need really? What makes it so urgent? What are we risking here if we don't figure this out? — Never before has there been such a big gap between how widely a technology is being deployed and our understanding of this technology. We can extract a lot of concepts from models right now, but it's also just a really hard problem. I wake up every single day and I think about interpretability. — What would you say makes this team so special to you? — Pretty much everyone, every researcher gets offers from Anthropic, DeepMind, OpenAI, and they choose to come here and take a pay cut to come here. In many ways, I think of what we're doing as building a bridge between an AI mind and ourselves. That's the path that we're going to take. No matter what, we're going to make this happen. — Hi everyone, I'm Nnamdi, a partner at Lightspeed, and this is The Investment Memo, the show where founders reveal the stories behind the businesses they are building and how they will shape the future of their industries. Today I'm joined by Eric Ho, the co-founder and CEO of Goodfire, a company redefining how we understand and steer the behavior of large AI models. Goodfire is a pioneer in the field of interpretability, opening the black box of neural networks so we can understand, debug, and intentionally design AI systems. Today, we're hearing from Eric on why researchers don't necessarily understand their own AI creations, the urgency of interpretability, and the role it has to play in building the future of AI. Eric, welcome to the show. — Thanks so much for having me. Excited to chat and dig in here. — Massive congrats on everything that you've accomplished with Goodfire to date, and in particular the Series B fundraise that you're announcing, a $150 million Series B. That's no small accomplishment. — Yeah, thanks so much, man. I mean, we're just getting started, though. It still feels like such early days. And I feel like the larger the number, the more it's a message of: you've got to get to work. It feels like it's still just the beginning of building something great. — Founders usually never get to see their investment memo, but today you get to see the memo, and not just the memo, but how it changed from the seed round that we originally led up until the Series B today. You ready to see it? — Yeah, let's do it. I've never seen one of these before. I'm super excited to dig in. I've raised a lot of rounds of capital. — So, here we go. We'll talk about a bunch of things today, but one of the things I want to point your attention to, that we'll talk about a bunch here, is the investment thesis section. Obviously, the story has evolved a lot since the seed round, but just to frame up how we originally talked about the company: if you recall, you remember this idea of Software 2.0 that we talked about a bunch, that we need new tools to work with this new form of software.
And so, I thought a good place to start is just the current moment that we're living through. We're in this big AI moment. AI has become this ubiquitous new form of software, as we were just saying, impacting the world from healthcare to heavy industry to hardware, etc. But even today, researchers are not totally able to understand their own creations and control the behavior of these models. — Yeah, they know almost nothing about it. — And so scientists and the general public alike are really worried about where AI will be applied in the future, where it's all headed. And so in April of 2025, Dario Amodei, the CEO and co-founder of Anthropic, published a really interesting essay called "The Urgency of Interpretability," where he described a case for why AI will go a lot better if models are interpretable, and he pretty strongly states that we're in a race between interpretability and model intelligence. I thought this was a great essay, and two lines I think are worth quoting: "People outside the field are often surprised and alarmed to learn that we do not understand how our own AI creations work. They are right to be concerned. This lack of understanding is essentially unprecedented in the history of technology." So I thought maybe you could frame up the current moment for the audience and explain what it is that we don't understand about AI models. — I loved the essay; I think it was apt and just an important one, and it mirrors a lot of my own thinking about the field. I think really never before has there been such a big gap between how widely a technology is being deployed and our extremely limited understanding of this technology, and I think that gap is only widening as AI

Segment 2 (05:00 - 10:00)

models get increasingly capable and widespread, in charge of increasingly mission-critical and important applications in day-to-day society. I think the reason why I'm building this company, my co-founders, every single person at the company: we really just saw a future where AI models were getting increasingly capable, with more scale, more compute, more data, yet nobody understood what the nature of this intelligence really was. We didn't understand how these systems work, and we didn't understand, if we put in this set of data, this is what comes out, these are the representations that form in the model. And it just seems deeply irresponsible to me to deploy all of these systems in mission-critical contexts without a deeper understanding of how these models work. There's just something kind of messed up and not really right there. That was the original thesis behind starting this business and building here, and I think it's a feeling shared a lot more widely than just in our company as well. — How urgent is the need really? What makes it so urgent? What are we risking here if we don't figure this out? — I think that AI is going to have massive consequences on society and it'd be enormously transformational. I mean, we have pretty much every single CEO of all the big AI labs talking about AGI and superintelligence. Think about a system that's smarter than us, you and me, every single person we know, at everything; that type of system and that promise, and also just the widespread job loss that will occur as a result, and they talk openly about that. The systems that are going to become these superintelligent systems, we just don't understand, we can't control, we can't steer, we can't shape and mold as our own. I just think there's a huge disconnect that, to me, is unacceptable. I feel a deep obligation to go and actually craft and shape these systems such that we can understand them and really be able to intentionally design them. And so I think that probably these AI systems will get to human-level intelligence with or without interpretability, but I just think that we'll be able to much more effectively steer and control these models if we understand them. — Totally agree. You know, one of my favorite metaphors that you've used throughout the lifetime of the company is this idea of a bonsai tree as a sort of metaphor for interpretability. I think it's been in pitch decks, board decks. Really gone far and wide with the bonsai. You got deep with it. — Yeah. — Maybe tell us a bit more about what you mean when you analogize to the bonsai. — I think of the systems that we're creating today as the products of scale, data, compute, and just a lack of understanding of the AI models that we're building. And so you can get lots of unintended consequences from these systems: you got MechaHitler coming out of Grok, and you have all sorts of weird behaviors of models, like sycophancy in 4o, that you didn't really intend. And we really see a future where we can actually shape these models as they're being trained, as they're being grown, such that we can actually put the stuff that we really want inside these models and actually gain trust over what we really have inside of these models.
So, I kind of think about the current way of building AI models as growing this massive tree, and it's all wild and knotted and gnarly; it's like the Whomping Willow in Harry Potter, you know, smashing around. And really, the future we want is crafting and shaping really intentionally, almost lovingly crafting the systems that we want, systems that can actually serve humanity and really be aligned with the values and morals that we really care about. And I think that's really the future that I want to live in, where we can deeply trust the AI models that we deploy really everywhere. — And I don't know anything about taking care of actual trees. — I don't either. Yeah. — The plants in my apartment die. — Yeah, I was going to make the same joke. But it's hard, and I think it's fair to say interpretability is not easy either. What is it that makes looking inside the black box of these models so difficult? — Interpretability is a hard problem. These models are getting really big. Some of the recent models that we're working with, like Kimi K2 Thinking, that's a trillion-parameter model. A trillion is a big number. And when you look inside the code of an AI model, it's a set of weights and parameters that don't mean anything to the human eye; they're just a bunch of random-looking numbers that you can't

Segment 3 (10:00 - 15:00)

make heads or tails of, that you can't read, that you can't hand-code or program. When you think about it, of course you can't architect and craft this software effectively, with high reliability, if you can't actually debug the software. You can't even look at it. You can't edit it. We're missing this fundamental set of tooling that allows us to actually gain a good and deep understanding of these systems. So, we lack the fundamental primitives for intentional design. — I want to turn back the clock a little bit to a primitive era in interpretability. If you think back to convolutional neural nets, when they really started to take off in the era of computer vision, you could see in some of the later layers of these models these different features, whether they were shapes or edges, that these models were picking up at different layers of the network. — Edge detectors and... — Yeah, exactly. You know, in a sense that was interp. — Mhm. — But obviously we've come a long way since then. — It'd be helpful if you could give maybe just an overview of the history of interpretability. In particular, maybe highlight what breakthroughs really made it clear to you that there was something here, that we would eventually be able to understand these models. — I think of interpretability as three eras. It's a really young field in terms of modern neural net interpretability of deep neural networks; there was really nothing to interpret before there were really interesting deep neural networks that we were about to deploy across the world. This is why the field is pretty young. I would say that this strand of thought started at OpenAI with the Clarity team that was started by Chris Olah, who's now a co-founder of Anthropic, and Nick Cammarata, who's one of the early members of the Goodfire team and was really influential in shaping the early days of the business. They were primarily studying vision models, really trying to understand Inception v1, which is this tiny little image classification model that is probably the most studied neural network ever. What they posited with their really influential Circuits thread was, one, that there are features inside these models that you can extract and understand; these models were actually doing something rather than just randomly applying a bunch of heuristics. Two, that there are circuits: features that fire together such that you can form more complex concepts. The example that they used was a window detector and a tire detector and a couple of other things combining to be a car detector, and that's a circuit, a set of features that fire together. So maybe that was era one of interpretability. Era two is the paradigm of training a type of interpreter model to extract concepts from these models. When you look inside models, they're almost like giant compression algorithms that compress the entirety of the internet into a relatively small number of neurons and parameters. What this often results in is that multiple of these neurons fire at once in order to form a higher-level concept, and often each neuron is also responsible for encoding multiple concepts at once.
To resolve this, a couple of folks, I think it was Lee Sharkey and a group at Apollo Research, proposed applying sparse autoencoders to language models to essentially map these concepts into a higher-dimensional space of clean, interpretable concepts. Anthropic, I think around that time, was also thinking about sparse autoencoders and then popularized them with Golden Gate Claude. That's maybe the second era: training this type of interpreter model to extract concepts. I hope that we're kicking off this third era of intentional design. This is not fully released yet, but more will come soon. We figured out a lot of the core problems with steering and guiding the training of models using the interpretability techniques that we've developed. So really, this is the era of the bonsai tree: how do you actually prune and shape a model during training such that we can precisely edit these models, get the representations that we want, and actually gain trust over the training process. Maybe underlying all of that, too, is this concept of AI agents being able to do everything. I think GPT-4 was also an inflection point, in that it was the first model that could describe the neurons of another model.
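To make the sparse autoencoder idea above concrete, here is a minimal, illustrative sketch in PyTorch. It is a toy under stated assumptions (the layer width `d_model`, feature count `d_features`, and L1 coefficient are hypothetical), not Goodfire's or Anthropic's actual implementation: activations from one layer of the model being interpreted are encoded into a much wider, sparsely firing feature space and decoded back, so that individual features can line up with single human-interpretable concepts.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy sparse autoencoder over one layer's activations (illustrative only)."""

    def __init__(self, d_model: int = 768, d_features: int = 16384):
        # d_model: width of the layer being interpreted; d_features: much larger
        # number of candidate interpretable features. Both values are hypothetical.
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations: torch.Tensor):
        # ReLU keeps feature activations non-negative; the L1 penalty below pushes
        # most of them to zero, so only a handful of "concepts" fire per input.
        features = torch.relu(self.encoder(activations))
        reconstruction = self.decoder(features)
        return reconstruction, features

def sae_loss(activations, reconstruction, features, l1_coeff: float = 1e-3):
    # Reconstruction term keeps features faithful to the original activations;
    # the sparsity term keeps them interpretable (few active at once).
    mse = (reconstruction - activations).pow(2).mean()
    sparsity = features.abs().mean()
    return mse + l1_coeff * sparsity

# Usage sketch: `acts` stands in for activations captured from the model under study.
sae = SparseAutoencoder()
acts = torch.randn(32, 768)
recon, feats = sae(acts)
loss = sae_loss(acts, recon, feats)
```

After training, each learned feature can be examined (which inputs make it fire) and, ideally, labeled with a single concept, which is roughly what "training an interpreter model to extract concepts" refers to in this second era.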

Segment 4 (15:00 - 20:00)

And so there's almost a really helpful recursive element to interpretability: with increasingly powerful AI systems, we are able to automate more and more interpretability techniques, because when you're looking at a trillion-parameter model, you can't describe what every single neuron does. A trillion is way too much. So this idea of automated interpretability, which Nick Cammarata and others pioneered at OpenAI, was also kind of underpinning everything: given enough time, all supervised techniques become unsupervised with AI. — Amazing history lesson, Professor Ho. I want to turn from the history of interpretability to the history of you starting the company. In 2023, you stepped down from your executive role at your last startup, RippleMatch, and I think it was around that time that we met and batted around various ideas. It was very clear to me how passionate you were about AI, and it was in June of 2024 that you raised the seed round that Lightspeed led to build this set of AI interpretability and model-editing tools for large AI models. The idea really resonated with me as someone who spent a lot of time in classic developer tools, because you were basically saying: hey, you should be able to directly modify these models and their behavior in a precise, atomic way, in the same way that we do with traditional software. And so I'm curious, what got you to start the company? Of all the things you could have done, why this company, interpretability, Goodfire? — I left my last company because, as I grew it, something just didn't sit right with me: that we were about to deploy these massive AI systems. I saw the future of scaling laws, of scaling up compute and resources to build better and more capable systems, and we didn't understand them. Something just felt deeply wrong to me, as a developer, as an engineer, that we're just going to go and release these systems that we don't understand out into the wild. And so the concept of interpretability was always really interesting, and big if true. The real question, the load-bearing part, is the "if true" statement. So a large part of my journey between the transition and officially starting the company was getting over a lot of these hurdles, like: could we actually make the technology work? To gain confidence there, I, in short, just talked to people much smarter than myself in the field of interpretability. Nick Cammarata was influential. My co-founder Tom, who founded the interpretability team at Google DeepMind, was really helpful in forming the early concepts and ideas. Lee Sharkey, who we just mentioned, who pioneered a lot of the use of sparse autoencoders for interpreting models, introduced me to Tom and is now the head of our London team. These folks, and the passion and clarity of the future that they saw, really helped me get past all of the doubters in the field. And also, I feel like we've done harder things than interpret neural networks. We sent rockets to space; that seems harder to me. It just felt like a tractable technical problem that we can go and solve. — That's awesome. So anything less hard than going to the moon, you're willing to do. — Pretty much, yeah.
I know you gave us some credit for our original investment thesis being relatively consistent with how you think about things today, but visions evolve over time, and you never capture it perfectly at the genesis. So I'm curious, maybe you could articulate what the original vision was and how you would define yourselves today, if at all differently. — I think it's largely similar, in that the huge unlock is still: how do you articulate the science of neural networks such that we can intentionally design better, safer, more powerful AI systems that we can really deeply trust. One thing that we didn't anticipate was how useful this would be in non-language modalities as well. We do a lot of work with life sciences and genomics and epigenetics, digital pathology, and we're starting to work with perturbation models. So a lot of really interesting things there, and also in image models; there's a lot broader usefulness to interpretability tooling than we originally anticipated. It's also just a really hard problem. So we've spent a lot of the first 18 months of the business building the best frontier interpretability research team that can really push the frontier of what we know about these systems. And it's still largely an unsolved problem. The amount of greenfield is still staggering in the field of interpretability. So we just need a lot more people to answer the call and really

Segment 5 (20:00 - 25:00)

deeply try to probe and understand what's actually going on in these systems. — One of the things you alluded to there, and I think it's been true about Goodfire since the start, is that you're always ahead of the game as far as how the rest of the world and the market is thinking about the applicability of this technology. Why do you feel like interpretability is ready for commercialization? — In short, back then it wasn't. So we had to create the market, create the tooling, do the research that was necessary in order to cross the chasm of building really anything of value. And there's still so much more work to be done there. Even on the research side, it was a lot of work to get us to the point where we could even monetize something and commercialize it. But I think we've gotten to the point where now we have a whole platform for model design, where you can intentionally design your models and understand them and interpret them and modify them. I think it was just a bet that we'd get there someday. — Yeah. — And it was a little faster than I actually guessed at the time. I thought we were going to be trudging through the mud for a few years before we would get anywhere interesting. — It's a fair point. When you started the company, it was early. — I think I told you that we were going to trudge through the mud for a couple of years, and that's the path that we were going to take no matter what. We're going to make this happen, no matter how long it took. — You mentioned the platform. Let's talk a bit about that. Back at the seed, I mentioned we had been talking about this idea of Software 2.0. And so in September 2024, you launched the first version of Ember, which was really the first hosted interpretability API, the idea being to allow engineers, in some sense, to code a model's thoughts and give direct, programmable access to those internal representations. — Mhm. — How different is interpretability from prior ways to direct the behavior of models? What are the kinds of things that a customer of Goodfire, a user of Goodfire, could do that maybe traditional fine-tuning couldn't achieve? — Yeah. So, first of all, I still like the idea of Software 2.0. I still really like that. — Vindicated. — Yeah, I still like that it indicates that there's some set of tooling and primitives missing that we really need in order to engage with models as we can with written software. So the primitives are still: understand, edit, and debug, that set of technology. It depends on the modality that we're working in for what's most useful, but you can think about our product as really a model design environment where you have access to all of these new primitives that you otherwise wouldn't have had access to before. It starts with training this type of interpreter model to almost map the mind of the model, map the loss landscape, such that we actually understand what's going on inside. What are the mechanisms of action inside these models? Then, once we've mapped these mechanisms and concepts, we want to explain: what do they actually do? What are they? How can you use them?
And now that we have this set of primitives, we can combine them: use them to monitor the model, use them to intentionally design it and train it and align it to what we want. We have a couple of examples: Mayo Clinic and Arc Institute, and one of our partners, Prima, in the life sciences space. Zooming in on Prima: they have an epigenetics foundation model that they're training in order to solve and cure neurodegenerative diseases like Alzheimer's and Parkinson's, and their model is state-of-the-art at early diagnostics for Alzheimer's. So we first started by mapping the mind of that model: what's actually going on inside? It's state-of-the-art, so we know that there's something in its weights, parameters, neurons that makes it better than our current understanding of Alzheimer's. And in doing so, in actually being able to explain what's going on in the model, we could help them, one, debug it and make sure we can remove confounders, because typically in healthcare you're dealing with noisy and sparse data, so we can remove the bad stuff it's using that it shouldn't be using to generate a prediction; and also uncover why it's superhuman. Why is this model special? And in doing so, we've actually discovered a new biomarker for Alzheimer's with them that, I think, maybe by the time this is out, we'll have announced. It's just hugely exciting, and I think it's the first instance of new scientific discovery in the weights of a biology foundation model, and that's just the first of many more to come. — Yeah, I really personally love the life sciences use cases. In our original investment memo, that was not there. — It was not. — But it's really interesting, right?
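To ground the "monitor and steer" primitives described above, here is a minimal, generic activation-steering sketch in PyTorch. This is a common research-level technique shown under assumed names (`add_steering_hook`, the layer index, and the strength value are all hypothetical), not Goodfire's Ember API or its actual model design platform.

```python
import torch

def add_steering_hook(layer: torch.nn.Module, direction: torch.Tensor, strength: float = 4.0):
    """Nudge one layer's output along a concept direction on every forward pass.

    `direction` would come from an interpreter model, e.g. a sparse autoencoder
    decoder column for the concept you want more of (negative strength for less).
    Generic sketch only; names and values are hypothetical.
    """
    def hook(module, inputs, output):
        # Transformer blocks often return tuples; steer only the hidden states.
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + strength * direction.to(hidden.device, hidden.dtype)
        return ((hidden,) + output[1:]) if isinstance(output, tuple) else hidden

    return layer.register_forward_hook(hook)

# Usage sketch with a hypothetical HuggingFace-style causal LM:
#   handle = add_steering_hook(model.model.layers[12], concept_direction)
#   ... generate text, now nudged toward (or away from) the concept ...
#   handle.remove()
```

Monitoring is the read-only version of the same idea: capture activations in the hook, project them onto known concept directions, and flag when unwanted concepts fire.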

Segment 6 (25:00 - 30:00)

I mean, in some sense, again, I think it's well aligned with the story of interpretability, which is that it's always more useful than people think. — Mhm. — And this idea that, in traditional LLMs, because we can read what the model is saying, you can kind of convince yourself that you understand what it's thinking. But in these other domains, these other areas where the model isn't speaking English, it's speaking DNA sequences or whatever it is, interpretability is the only way you understand, in some sense, what that model understands and why it's doing what it's doing. And it's especially critical because, as you mentioned, with some of these organizations you're working with, these models either now or in the near future will be superhuman in those domains. — Mhm. Yeah, already are in so many ways. And when you're trying to train a model and it doesn't speak English, it only speaks nucleotides, it's pretty unhelpful. — And biology was my worst subject, so that's really helpful to me in particular. — Yeah, unfortunately it was mine as well. But yeah, I think it's just critical for all of these non-language modalities. And people think AI, most people just think ChatGPT right now, and so I think there's just so much more that we can help with. Right now we spend most of our time working with biology and biological domains, as well as in language still. Those are the two primary use cases. — You've talked a bit about the team already, but I want to double-click on that, because it is, I think, one of the most special things about the company, and it's evolved a lot since the early days. From our original seed memo, in the team section, I think the direct quote is: "Goodfire hopes to grow the team to 10 in the first year," largely focused on research and engineering. We are now 18, 20 months on or so. How big is the team today? — It's around 40 people, growing very quickly. I think in a year we grew the team to 20, 25 people, with much faster growth in recent months, and we've hired a lot of the folks that we were talking about targeting, like Nick and Lee, a lot of the early pioneers of the field of interpretability. — So were you expecting to grow this quickly? — No. I thought it would take a little bit longer both to convince these folks to join and also to push forward the research to an inflection point where we could go out to market and commercialize. — What would you say makes this team so special to you? — I think that all of us are unified by this belief and this mission that there's something not quite right about the way that we're building AI models today. It feels reckless. It feels like the only thing we care about is scale. And I think scale is deeply important, and kind of the magic behind how this intelligence is formed, but there's almost a lack of curiosity with most researchers today. Like, why is this happening? How can we possibly have these enormously complex alien minds almost crash-landing on Earth and not be curious about this? So it's both a deep curiosity, but also a shared vision of a future where we just really want to knock the world into a future where we actually understand the systems that we deploy.
And I think that's just a shared feeling from every single person on the team, and it's a really special place to be, because there's both a lot of joy and awe that we get from just looking around and seeing the moment in time that we're in. It's just so fun thinking about how these models think and work. And there's something deeper there; there's an obligation, a responsibility for us to push forward this science, because we also see how things could go very wrong without our contributions. — What is it about the independent context in which Goodfire operates that lets you have the impact that you've been having, and that you will have? — Pretty much everyone, every researcher, gets offers from Anthropic, DeepMind, OpenAI, and they choose to come here and take a pay cut to come here. And I think it's increasingly obvious that this is the place to build towards the future vision of intentional design. I think it's a research agenda that the other labs probably don't share. They view interpretability a lot more as maybe a post hoc auditing technique, whereas we really view it as the key lever to crafting and shaping these models intentionally, the major unlock that we actually need in order to design these systems effectively and safely. I also think we have a big advantage in that this is everything that we think about. I wake up every single day and I think about interpretability, and I think that

Segment 7 (30:00 - 35:00)

focus gives us a huge advantage. Our infrastructure is set up top to bottom through the lens of interpretability, which requires access to intermediate layers of models, and that's not really easy to get when you're at a big lab. We have a kind of core technical advantage in what a bottom-up organization like that looks like. — And just to hit on that again, just to give you guys credit for being early and being willing to do this independently, another quote from the memo: "Goodfire has the opportunity to pioneer a new category centered around interpretability tooling, with the potential to become an essential building block for responsible AI adoption and deployment." — All right. — "Goodfire is an early mover in this space, which has historically been the domain of AI research labs, but finally seems ready for productization and commercialization." And so that leads to the first fundraise. Lightspeed led your seed round, a $7 million seed round in June of 2024. — Really not that long ago. A lot has happened. It feels like that was maybe a decade of my life. — Exactly. Nine months later you raised your $50 million Series A, and then today you're announcing you're raising a $150 million Series B. That's incredible. Congratulations. Thank you for letting us partner with you on that journey. And it has been a journey. We were talking earlier about our first coffee chats, back when you were still just thinking about what to do next, and it's just amazing to see how far you've come. You were very clearly focused on assembling the best set of people, whether it's Dan, one of your co-founders, someone you had years of experience working with from your prior time together at RippleMatch; I could tell even from the first time that I met both of you together that you guys just fit, like, I don't know what the analogy is, hand in glove or some kind of thing with hands and gloves. It was a fit. — Yeah, I've worked with him for so many years now, and he's just unbelievable at what he does. — And then Tom, I mean, just a genius and someone who has contributed so much, having started the interpretability team at DeepMind, among other accomplishments, and it meant a lot that someone like him would want to join you in this endeavor. There are lots of partners that you've had the chance to work with, whether it's co-founders but also investors. How important is the capital stack that you've assembled to go after this vision that you've articulated? — I think just deeply important. I mean, my last company was a hiring company, so building the best team is core to my DNA and what I spend a really large percentage of my time thinking about. I spend a lot of time recruiting, and I have for the last decade, and the team and the people are really everything in a company. I think we took a really similar approach to choosing our investors: we had the opportunity to choose from a number of investors in every single one of our funding rounds, and we just brought along the people and chose the partners that we deeply trusted, that really got what we were trying to do. It's really easy to take a maybe short-sighted view of interpretability, and not go for the really big push forward of frontier interpretability techniques.
And I think choosing the right partners to go on this journey with, partners that we deeply trust and can get along with, is just really important. — I'm super grateful, and Lightspeed is super grateful, that you picked us to work with you from the very beginning of this journey. Three fundraises in roughly 20 months is a lot of activity. — Yeah, I spent a little bit too much time fundraising. Not enough recruiting, obviously. — You're doing something right. So there's a lot going on at Goodfire, in interpretability broadly, in AI broadly. — Mhm. — As you think to the future, maybe paint a picture for where the field of interpretability is going. What breakthroughs are on the horizon, either coming from Goodfire or elsewhere, that get you most excited? — I am deeply excited by and motivated by this vision and future of intentional design that we're creating. I think that's just the future of interpretability: in order to be most useful, we have to be part of molding and steering the training process of models, such that we can craft the systems that we actually truly want. That's where we're spending a bunch of our time, and we'll announce a couple of key new products and breakthroughs really shortly. Yeah, we're just hugely excited about our new ability to really edit and steer these models. — Anything that worries you? — I mean, these models are getting good so

Segment 8 (35:00 - 40:00)

so quickly. When Claude Code first came out, that was a little bit of an "oh" moment for us across the entire company. We were wondering: do we have enough time? Do we really have enough time to actually knock the future that we really want to see into the world? And going back to the urgency of it all, I really do see and feel that every single day, where these systems are getting increasingly capable and impressive. I just really want to make sure that we change the paradigm of how we craft these systems, such that we can understand them and intentionally design them. — What are the big challenges, not just for you, but for the field broadly? — I think that we're still in the very early innings of being able to understand AI models. We can extract a lot of concepts from models right now, but we're still just scratching the surface of the quality of interpreter tools that we have, the quality of the representations that we extract from models. And we are just fundamentally bottlenecked as a field: on talent, on compute, on resources. There are just not enough people who are curious about these problems, who want to dedicate their lives and their careers to solving them. It all comes back, I think, to hiring. How do we bring on the best possible talent, the best possible people, who maybe aren't directly in interpretability now but are training models at bigger labs, who are curious about how these models really work and who want to play a part in the future that we're creating, not the one that currently exists. — One of the things that got us most excited, just on the talent point, was that, to your point, there are just never enough people who really understand this and are deeply passionate about it. — Maybe a few hundred researchers focused on this full-time right now. — But by being this sort of lighthouse company, an independent company going after this problem and dedicating 100% of your time to just this problem, you can really become the predominant place for this kind of talent to work, and I think you've done that, and in such a short period of time. — Yeah. Well, we have $150 million more to go and hire the best possible folks in interpretability research and engineering. And it's not just researchers: it's engineers, it's go-to-market, it's operations, it's really across the board. Yeah, let's go. There's a lot of work to be done, and I feel like there are people out there who probably see the future that we do, who want to shape it the way we do. And I think a large part of my job is almost lighting the beacon of interpretability here in San Francisco. If you see the future as we do, come on, let's go, let's get to work. — No shortage of things to be done. But on the theme of lighthouses, beacons, and I guess just bright lights in general, is there a north star that you would highlight for your next phase of growth? — Yeah. The north star is intentional design: how do we truly understand and intentionally design these systems, as we can with written software, and build the reliability and trust in these systems that I think we all really want and desire. In many ways, I think of what we're doing as building a bridge between an AI mind and ourselves. It's most obvious that we have no bridge right now between a genomics model's mind and our own.
There's a bigger gap to bridge there. But also, in being able to shape these models, we want to be able to give feedback directly into the mind of an AI model on why it did something wrong. There's almost no information that you're communicating to the model when you're giving it reward right now. But we are really building the bridge where we can actually say: you got the answer wrong for this reason; you should be thinking about this as you're generating your response. So the why really matters. Why a model gets to an answer, I think, is more important than getting the answer right, so that you get the right answer for the right reasons rather than the wrong ones. That's really the future that we see, where we can go in that direction to AI models and really give natural-language feedback to craft a model, and the other way around. This idea of superhuman knowledge transfer from AI to humans, as AI becomes superhuman in an increasing number of modalities and domains, is going to be increasingly important, or else we'll be left behind. Interpretability really is the only way that we can transfer knowledge that AI models have about our world, knowledge that we don't have, back to humanity. So this is going to be one of the best ways to progress scientific discovery and knowledge very soon. This all kind of started when Tom and his collaborators at DeepMind were interpreting AlphaZero, which is better than any human chess player alive. They got a chess grandmaster to be on their paper and taught him how to play chess more effectively in very narrow positions, but that was just kind of that

Segment 9 (40:00 - 40:00)

first case of AI-to-human concept transfer. And I see that happening really across the board in every modality where AI has a fundamental advantage over humans. — Awesome. Well, chess tutors for everyone, I guess. — Yeah, that's right. Superhuman chess tutors. — Exactly. Eric, thank you so much for doing this. It's a really special moment for me personally, and I know it's special for Lightspeed broadly. Again, we really appreciate you working with us. We're just really grateful for the partnership, and congrats again on all the success so far. — Thank you. This was so much fun, and I'm also just deeply grateful for the partnership from the beginning. We're still just getting started; there's still a lot more to go and build, and it's super fun working together. — Yeah. Good. Well, I would say roll up your sleeves, but your sleeves already are rolled up, as they always are. Very good. Thanks, man. — Cool. Thanks.
