Inside the lab training AI to cure diseases | Sam Rodriques: Full Interview


Table of contents (7 segments)

Segment 1 (00:00 - 05:00)

- Even if we had all the information we needed in order to understand how the brain works, we wouldn't necessarily know it. Just to give, like, a specific example, there are 20,000 genes in the human genome. You don't have enough time to go and read about all of them. And even if you could read about all of them, you wouldn't remember by the time you got to the end what you read about at the beginning. And so that's basically what convinced me that the most important thing that we could be doing today is going and trying to build an AI scientist, to build something that is better than we are at understanding complex science. These new tools that we have, do they allow us to make the discoveries that will get us there? Whether it's AGI, whether it's superintelligence and so on, all I care about is: are we making discoveries? Do the tools allow us to do that? Well, I'm Sam Rodriques. I'm the CEO of FutureHouse. We are a research lab in San Francisco focused on building an AI scientist to automate discovery, to automate research in biology and other complex sciences. (gentle foreboding music) - [Interviewer] All right, Sam, so we're gonna start with a super easy one. What convinced you that sort of a multi-agent AI scientist could potentially help close the gap between stalling science and, like, adding more PhDs, to solve that same productivity problem? - Yeah, yeah. So, I'm a theoretical physicist by training. So I started out doing quantum information theory. I ended up feeling like there were no unsolved problems left in physics. And so I moved into biology and neuroscience, where everything is unsolved. And one of the things that you learn when you're, you know, studying neuroscience... like, I really wanted to understand how the brain works.
And one of the things that you learn is that even if we had all the information we needed in order to understand how the brain works, we wouldn't necessarily know it, because no one has enough time to go and read all the scientific literature. And even if they could read all the scientific literature, they would, like, wouldn't be able to hold it in their head for long enough to, like, assemble it into some comprehensive whole, right? Just to give, like, a specific example, you can say the same thing about, like, the cell. There are 20,000 genes in the human genome, right? And even if you could read about all of them, you wouldn't remember by the time you got to the end, like, what you read about at the beginning, right? And so that's basically what convinced me that the most important thing that we could be doing today is going and trying to build an AI scientist, to build something that is better than we are at understanding complex science. Biology is, like, a mixture of complex, like, interacting... it's like a complex interacting system, right? Where, like, all the genes are interacting with each other and many, many of those interactions are very important. It's hard to imagine how you could actually go and, like, understand something complex, like how a cell works or how the brain works, without actually considering each one of those individual components. But in a world in which, like, you know, humans don't have the cognitive capacity to go and hold all those pieces in their heads at once, we need something that's better than we are at synthesizing information. - Chapter 1: Building an AI scientist - [Interviewer] What do you consider to be an agent? And how does that dovetail with the core pieces of your platform? - Okay, yeah, awesome. - Falcon, Owl and Phoenix. - Yeah, exactly. So the notion of an agent comes, like, originally from, like, the reinforcement learning field.
And reinforcement learning has gotten really hot recently because of its role in language models, in training language models, but reinforcement learning goes back way further than language models. An agent is just something... it's, like, an AI system that can take an action, and then observe a result, right? Make some observation. If you just have a model, for example, it gets some input and it produces some output; that is, like, not an agent, right? Like, language models on their own are not agents, right? When you put it into an environment in which it takes an input, produces an output, and then the environment changes based on its output and then it observes the environment again, you would then refer to that as an agent, right? So I like to give the example of playing Go. If you have a model that takes as input the current state of the Go board and outputs the next move, that's just a model, right? You would not call that an agent. If you put it into a setting in which it takes as input the current state of the board, it outputs the next move, and then the board changes in response to its move, right? Because, like, the other player, its opponent, then takes a move, then you would call that an agent, right? It all has to do with the system being in an environment where it can take actions, observe results, and then take another action. The concept as we apply it to language models is, like, the same thing: language models are not on their own, like, agentic, right? A language model just takes as input some natural language and then outputs some more natural language, right? But when you put it into a setting in which, for example, the language model has access to tools, or is otherwise interacting with an environment

Segment 2 (05:00 - 10:00)

then we would refer to that as a language agent. This is important because what we work on at FutureHouse is, like, true agents, which is to say that we work on systems that actually consist of, like, language models with access to scientific tools. This is in contrast to, like, actually what most people are building when you hear about agents. Like, most papers that talk about agents or whatever, like, multi-agent systems, et cetera, are often actually not true agents, right? Most things that people are working on that they call agents are actually just, like, you tell the language model to do X, then Y, then Z. There's, like, no actual decision making on the part of the language model. That's not an agent. You can get interesting behavior from systems like that also. But what we do at FutureHouse with agents is that we're, like, really trying to put the decision making into the hands of the language model. We try to make sure that it has, like, the freedom to go and decide what the right actions are to take at any given step. One of the really cool things about our platform is the fact that actually, like, we've built the platform in a way that allows the agents to interact with each other, right? So we have several different agents. We have Crow, which is, like, a kind of general purpose agent that has access to the scientific literature. And so Crow is really good for any time when you want to do something in science that is informed by the scientific literature and it doesn't have to be too specialized. So for example, like, a good use of Crow would be you wanna know, like, what parameters should you use in a given experimental protocol, right? Crow will, like, go and search the literature, find some relevant papers, come back to you with those papers and so on. Crow is also very good for, like, generating new hypotheses and things like that.
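The action-observe loop Rodriques describes, where a model becomes an agent only when its outputs change an environment it then re-observes, can be illustrated with a minimal Python sketch. Everything here is hypothetical: the "model" is a hard-coded stub policy and the single tool is a toy literature search, not FutureHouse's actual implementation.

```python
# Minimal sketch of the action-observe loop that distinguishes an agent
# from a bare model. Illustrative only: the "model" is a stub policy and
# the one tool is a toy literature-search function.

def search_literature(query):
    """Toy stand-in for a literature-search tool."""
    corpus = {
        "phagocytosis": ["ROCK inhibitors modulate phagocytosis (immunology)"],
        "AMD": ["Phagocytosis is linked to AMD pathophysiology (ophthalmology)"],
    }
    return corpus.get(query, [])

def stub_model(observations):
    """Stub policy: choose the next action based on what was observed so far."""
    if not observations:
        return ("search", "phagocytosis")
    if len(observations) == 1:
        return ("search", "AMD")
    return ("answer", "ROCK inhibitors may be worth testing against AMD")

def run_agent(model, max_steps=10):
    observations = []
    for _ in range(max_steps):
        action, arg = model(observations)
        if action == "answer":                 # the agent decides it is done
            return arg, observations
        result = search_literature(arg)        # take an action...
        observations.append((arg, result))     # ...and observe the result
    return None, observations

answer, trace = run_agent(stub_model)
```

The key design point is that the policy sees the accumulated observations at every step and is free to choose the next action, rather than executing a fixed do-X-then-Y-then-Z script.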
Falcon is a much deeper literature search agent, and Falcon is really good for things that require you to go and search a large amount of literature and synthesize information from that literature. So for example, going and performing, like, meta-analyses or so on, right? You want to go and say, "Of all the papers in this field, how many of them use Method X?" Or Y? Or, like, you know, what is the average... you know, of all the papers that have ever studied the effects, the causes of metastasis in breast cancer or whatever, what are the most common causes that are cited, right? That requires you to go and read a lot of literature, kind of analyze that literature; that's something that Falcon is really good at. We have Finch, which is a data analysis agent. You give it a data set, it writes code, it can run the code, it can see what happens after running the code. You can do very open-ended things. You can give it a data set and you can say, like, "Based on this data set, what can you tell me about the causes of metastasis in cancer?" Right? And it can go and it can look at, like, what kind of data set it is. And then it can say, "Oh, well, I'm gonna go and I'm going to look at the correlation between, you know, thing X and thing Y," for example. And then it can write some code, run that code and we'll get a result. For example, it might get a correlation value or plot that it can look at. And then, based on what that plot looks like, it can literally decide, "Oh, you know, that actually wasn't that interesting. I'm gonna go and do something else instead." In the same way that, like, a human doing science is gonna go, like, search the literature, find a paper, and based on what the paper says, is gonna go and try to, you know, say, "Oh, maybe I'm searching in the wrong direction," right? Like, you know, maybe actually this thing that I'm thinking about is, like, not the cause of cancer metastasis. Maybe it's this other thing instead. That's, like, a decision, right?
Where, like, you've performed some action, you've done some data analysis, you've gotten an observation, you've got the result of the analysis, then you're gonna go and decide to do something different instead. Those are the kinds of decisions that we are putting into the hands of the models. So another one that we have is called Owl. It's an updated version of a tool that we launched back in September called HasAnyone. We call it, like, a precedent search: it is specialized to answer the question, has anyone done something before? Has anyone done X before? I think people who are familiar with science, who do a lot of science, will recognize that you spend a lot of time asking the question, has anyone done something before? And if not, why not? For example, if you come up with a new hypothesis, you wanna know, wow, is this hypothesis actually, like, novel? Has anyone tested this before? That's a great example of how people would use Owl. And frankly, it's a really good example in general of just, like, how people use the scientific literature. I think, like, one of, like, several main purposes of the scientific literature is to be able to answer that question. So that's Owl. And then, Phoenix is an experimental agent that we launched which is really built around experiment planning in chemistry. And Phoenix is, like, a very, very early experimental prototype. So it's nowhere near as polished as some of the other agents that we launched. But the reason that we launched Phoenix is basically because we wanted to give people some insight into what it will look like to have, like, these AI scientist agents when they have access to more different tools beyond just literature search and data analysis, right? So Phoenix has access to specialized tools in chemistry that allow it to go and plan reactions

Segment 3 (10:00 - 15:00)

figure out how to synthesize different molecules, look at, like, what are the actual... like, what reagents would you need in order to do some kind of chemical synthesis? How much would those reagents cost? Those kinds of things. It's good for experiment planning in chemistry. I think the thing that's cool about it is that scientists have a lot of very unique, specialized tools that they use. Phoenix is just the first example of where, like, over time, as we give agents access to these tools, they will become able to automate more of the workflows that scientists carry out. The interesting thing about the way that we built this platform is that we've built it in a way that the agents can actually talk to each other. We can build agents where, like, you know, the agent has the ability to go and search the literature and call on, like, specialized agents. So the agent, you know, could call on Finch or the agent could call on Falcon. Basically, it can, like, delegate, right? And this is how you get into a real, like, multi-agent system where you have many different agents that are specialized for different things and they can talk to each other. The vision, like, where we're going, is you can imagine at some point that, like, you want to come up with hypotheses for, you know, what are the molecular drivers of cancer metastasis, for example. You're gonna have Falcon go and search the literature, gather a bunch of data sets, download those data sets, and then hand it off to Finch, which will then go and do the data analysis. And so that's where you can start to get into, like, really interesting, complex, sophisticated workflows, right? And just by adding more functionalities, adding more agents, I think, over time, we'll be able to build up what is, like, effectively, like, a virtual laboratory or something like that, right?
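As a toy illustration of the delegation pattern just described, where a coordinating agent hands work off to specialists in the spirit of the Falcon-to-Finch hand-off, consider the sketch below. The agent names are borrowed from the interview, but these implementations are invented stand-ins, not the real platform.

```python
# Sketch of delegation in a multi-agent system: a coordinator calls a
# "search" specialist, then hands each result to an "analysis" specialist.
# The names echo the interview; the bodies are toy stand-ins.

def falcon(topic):
    """Toy deep literature-search agent: returns 'datasets' for a topic."""
    return [{"topic": topic, "values": [1.0, 2.0, 3.0, 4.0]}]

def finch(dataset):
    """Toy data-analysis agent: computes a summary statistic."""
    vals = dataset["values"]
    return {"topic": dataset["topic"], "mean": sum(vals) / len(vals)}

def coordinator(question):
    """General agent that delegates: literature search first, then analysis."""
    results = []
    for ds in falcon(question):
        results.append(finch(ds))  # hand each dataset off to the analyst
    return results

out = coordinator("drivers of cancer metastasis")
```

The design choice worth noting is that each specialist exposes a plain callable interface, so the coordinator can treat "call another agent" exactly like calling any other tool.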
- Chapter 2: Radical transparency and human oversight - [Interviewer] You sort of have a very transparent workflow where you're sort of seeing, sort of, like, all of the chain of thought- - Yeah. - [Interviewer] That these models are going through. Why is it important to show those things? Like, talk to me about some of the principles that made that part of the decision making. - It's very, very important in science to make sure that, like, your knowledge is grounded somehow in, like, the reality of nature. Whenever you have some belief, you want to know specifically, like, where does that belief come from, right? And specifically, like, all beliefs in science are derived from experimental results because, you know, that's the only thing... Science is empirical. I mean, the only thing that we know for sure is that we can conduct some experiments and we can get some results. So as much as possible, when you have something that you believe in science, you want to be able to go back and trace it to, like, the original experiments that were done, right? Like, that's the ideal. And so for us, what that means is, like, when Crow or Falcon or Owl are out searching the scientific literature and analyzing what is known on some topic, right? You want to be able to see exactly, like, every single step that they went through, where did they get the information that actually fed into the final answer. So there are citations in the answers that they give you. If you look at those citations, it gives you not just the paper, it gives you exactly the detailed reasoning of why the agent thought that citation was important for answering the question. And if you go into the reasoning trace, you can see exactly what are all the other papers that the agent considered on the way from, like, you know, getting your question to coming up with the answer, right? And so that just means that it's fully traceable. It means that you don't have to worry as much about things like hallucinations.
I mean, you always have to worry about hallucinations. And for the record, hallucinations happen regardless of whether you're working with, like, an agent or, you know, an AI, or whether you're working with humans. It turns out humans also make stuff up, right? If there is a hallucination, it allows you to track it down, right? It's just the ultimate Show Your Work. One of the questions that I get asked a lot is, where do we expect the models to be better than humans, right? Where do we expect humans to be better than the models, like, right now and in the future, right? And obviously, prognosticating into the future is always a little bit challenging. But I will say that right now, at least, I think that the best humans in the world in any particular topic will still have much better judgment than the agents that we have produced, or than basically any language models, than any AI that anyone has produced, right? I do not think right now that we are at the point where even the best agents are better than the best humans. What the agents are really, really good at is that they know way more about more things than anyone else. And the reason this is really exciting for biology is that biology requires you to integrate information across many, many different fields, right? In order to come up with a new drug, you need to be able to integrate information across, like, basic biology, from, like, DNA, proteins, like basic biomolecular interactions, up to, like, cellular biology, how the different biomolecules work together

Segment 4 (15:00 - 20:00)

in order to make the cell go, up to, like, the disease mechanisms, up to how the disease mechanisms in the cell affect, like, the disease in the organisms as a whole, up to, like, how you would even conduct clinical trials, what is your regulatory pathway, all the way up to, like, if you could even make this drug, would it be covered by insurance, right? Like, is the medical unmet need large? And so on. You need to know all of those things. In any given area, the models that we have today, the agents that we have, are not gonna be better than the best humans. They know a lot about all of those things, right? Whereas, like, you know, humans are very spiky. Humans will know a lot about one thing, will know more about one thing but way less about the others. And so what we're really excited about is, like, this idea that even though today the agents are not better than the best humans, they may be able to come up with new ideas, make new discoveries, and overall have better judgment basically because they know more about more things. They know some about more things, right? So that's where we are today, basically, right? Like, humans know a lot about some things. The models know some about a lot of things. Where we are going in the future is that I do think that over time, like, we're just gonna keep creeping up. The model performance will just keep creeping up to the point where, like, eventually, the models may know a lot about a lot of things, whereas humans just know a lot about some things, right? And at that point, I think we should expect that in many areas, the models will be better at generating hypotheses than humans because not only will they have, like, a similar amount of knowledge in the specific domain, but they'll also combine that high level of knowledge across, like, many, many different domains. Now, I don't think that this means that humans are gonna be, like, obsolete immediately or anything like that, right?
And the reason I say that is we should expect that AI is very good at removing the intelligence bottleneck for progress. So wherever the bottleneck is intelligence, AI ultimately will help to remove that bottleneck. However, intelligence is not always the bottleneck. There are many things for which intelligence is not the bottleneck. A classic example is if you want to, like, you know, mine rare earth metals from the ground, right? Like, you know, maybe if you were more intelligent, you would come up with, like, a better way of mining. But fundamentally, at the end of the day, you have to actually literally go and get the rocks and, like, break the rocks open and get the heavy metals out. And even if you have, like, the absolute best, most intelligent way to do that, it's still probably gonna take a lot of time, right? And so that's an example where, like, there will be fundamental bottlenecks where we should expect that even the most intelligent thing will not be able to accelerate them beyond some limit. In biology, where this shows up is, you know, we have fundamental limits. Like, for example, if we had a cure for aging today, right? It would take at least five or 10 years until we actually knew that it was working, 'cause that's the timescale on which humans age. So even with, like, the most intelligent model possible, if we had a cure for aging today, we should not expect to know that it was working for at least a few years. This is, like, really important to recognize because it puts a lower bound on how quickly we can expect to be able to cure diseases. I do think we'll cure many, most, like, maybe all diseases. It's, like, a little bit, the jury is out, but it will take some time, because intelligence is not the only bottleneck. This also points to, like, what the role will be, like, where I think humans will not be replaced immediately.
I'm not even sure that humans will be completely replaced at all in science. I mean, it's very hard to say with these things, but I think it's like unlikely that humans will be completely replaced in science because there will be many, many decisions to be made where having more intelligence doesn't help. So for example, where you're like, look, there are two different ways that we go about curing this disease or that, you know, two different kinds of trials we could conduct. There's good evidence on both sides. It's not obvious which one is better. More intelligence doesn't help you to figure out which one is better because you're fundamentally like epistemologically limited. You're limited just by how much information you have, how many experiments you've conducted. And so humans can play a role there, like the AI may not be better at making those decisions. Humans will play a role there in making the decisions as well. - [Interviewer] I'm curious to get your thoughts on what do you think AGI is and do you think it's even worthwhile for us to discuss that versus building practical tools that achieve a specific result? - Yeah, that's a great question. People use AGI to refer to this idea of a system that is like better than humans or comparable to humans at basically everything. Language models seem to exhibit a bunch of the properties that we associate with intelligence. Their intelligence works in a way that's very different from ours. And so we should expect that for a while, there will be some areas where humans are better

Segment 5 (20:00 - 25:00)

and some areas where the AI is better. And over time, maybe the AI could overtake the humans in all or most areas. The important thing to recognize is that for a lot of applications, it doesn't necessarily matter. When we're working on these models, one of the things that we often get asked is: how close are these models to AGI? How close are they to superintelligence? Do we need superintelligence before we can expect, like, AI to make new discoveries or things like this? I guess my typical answer to that is, you know, AGI refers to this notion that you could have an AI that is way better than humans or on par with humans across basically all tasks. Superintelligence is where you have an AI that's way better than humans. My basic answer to this question is I'm not sure that it matters, right? So I was a theoretical physicist originally. And back in physics, in the kind of '50s and '60s, as people were debating the foundations of quantum mechanics, there was a lot of kind of philosophical discussion around the implications of quantum mechanics and, you know, the many-worlds theory. And what emerged was a school of thought which was, like, the, you know, "shut up and calculate" school, which was like, yes, there are many different ways of interpreting quantum mechanics and thinking about what the philosophical implications are, but fundamentally the predictions associated with those different philosophical interpretations are the same. And so to some extent, like, who cares, right? As long as you can make predictions, it doesn't matter. And so this is, like, to some extent, the philosophy that I have as well around AI, which is I'm not sure I care whether it's AGI. I don't care if it's superintelligence. What I care about is: are we able to make new discoveries that will allow us to cure all the diseases, extend human lifespan, feed the entire planet by making crops that are much more productive, figure out how to go to space and explore the universe, right?
Like, these are all things that should be possible technologically for us as humans. And the only real thing that matters is: these new tools that we have, do they allow us to make the discoveries that will get us there? The rest is semantics, right? Whether it's AGI, whether it's superintelligence and so on, it's semantics. All I care about is: are we making discoveries? Do the tools allow us to do that? - Chapter 3: Open science and the road to a virtual lab - [Interviewer] Why sort of take an open source approach to this? And what exactly are you expecting to see sort of from open source science in the future as it relates to AI? - Yeah, historically, AI has been really kind of like a bastion for the open source community. If you look at, like, the first 10 or 20 years of development in AI, almost everything that people were doing was open. Algorithms were open, models were open, et cetera, and that really helped the field to move along. Recently, as it has become much more expensive to develop frontier models, people have stopped open sourcing their models. Maybe some people say it's for safety reasons, some say it's for commercial reasons. It might be a mixture of both. It doesn't matter. Certainly, the lack of open source, which is probably necessary for the commercial applications of AI, is a hindrance when it comes to technological development. Our goal at FutureHouse is just to accelerate scientific discovery, and that means just making sure that all the scientists around the world have the tools that they need in order to automate science and that they're able to build off of those tools, right? We want scientists around the world to have access to the best tools that they can have access to, and to be able to figure out how to integrate them into their workflows to make new discoveries. Our philosophy so far has just been, let's make everything that we're doing open source.
We will encounter some of the same considerations around to what extent things should be open source, what fraction should be open source and so on, which is just a reality of working in this field, which is very capital-intensive. One of the things that you learn very quickly when you're in biology, when you're in biotech, is that you think, like, if you discover a new drug, wouldn't it be great to make it open source and make it available to the world? It turns out if you discover a new drug and you make it available in open source to the entire world, then no one is incentivized to invest the money needed to take it through clinical trials and show that it works and that it's safe and effective. And as a result, your drug will die. You know, in biotech, we learn very quickly that in these capital-intensive fields, and biotech is also very capital-intensive, you have to have intellectual property protections in order to be able to invest the capital needed to make something that really works and is usable by people. There may come a time at FutureHouse also where we basically say, you know, we have to keep things closed source in order to be able to invest the capital in them that we need to invest in order to develop them. And at that time, you know, when we get to that kind of stage, we may also think about, like, does it make sense to spin out companies for application X

Segment 6 (25:00 - 30:00)

or application Y or application Z, right? I mean, FutureHouse is a nonprofit and we're dedicated to automating basic discovery research in science. But, you know, as we find applications, applications that are extremely capital-intensive that are also likely to yield commercializable advances, we may spin companies out to take advantage of those. This is just what you have to deal with when you're working in a very capital-intensive field. One of the first really, really cool things that we've been able to do with the platform is to actually, like, identify what could be, like, a novel treatment for age-related macular degeneration, which is a major cause of blindness in people over the age of 50. And, you know, I wanna preface this by saying it takes a lot to go from an idea, to go from a concept, to a cure, right? And so, we're not out here saying, "Oh wow! The AI has discovered a new cure for blindness." But what I can say is that with these models, we were able to, like, follow some breadcrumbs and identify a novel hypothesis about how we could cure this form of blindness, age-related macular degeneration, that seems very promising. We assembled our models into a kind of, like, hypothesis generation configuration where they were going through the literature and figuring out, for age-related macular degeneration, what are all the possible mechanisms that you could use to treat that disease? And then, what are all the ways that have been explored in the literature that could affect those mechanisms? And for all the possible ways that they would find, are there any of those mechanisms that look like they would be likely to result in treatments for AMD that have not already been explored? And what we found when we did that was that there's a specific category of molecules which are called ROCK inhibitors, Rho-Kinase inhibitors. It was known, especially in the immunology literature, that they have an effect on a cellular phenotype called phagocytosis.
It's a word that describes, like, cells eating stuff out of their kind of local environment. And so the immunology literature knew that these ROCK inhibitors have this effect on phagocytosis. The ophthalmology literature, that is to say, people, clinicians who work on eyes, knew already that phagocytosis is highly connected, is closely related to the pathophysiology of age-related macular degeneration, right? But this connection between ROCK inhibitors and AMD had not been made in the literature previously. And in particular, the connection that had not been made in the literature previously was that there is actually an already approved ROCK inhibitor, and it looks like maybe, you know, that molecule may be directly applicable as a treatment for AMD. And so that was the novel hypothesis that our models came up with, which we then took around to several different ophthalmologists. And, you know, the ophthalmologists looked at it and they said, "Hey, this is actually pretty interesting," right? This looks pretty novel and it's pretty interesting, and so maybe it's worth trying. So then, what we did was that we went and we actually tested those ROCK inhibitors in a cell culture assay for AMD. So this is, like, a first kind of experiment that you can conduct to see whether this idea holds water. And it actually worked really well. They worked very well. They even worked better than the positive controls that we had in that assay in the wet lab. The first experiment we did was that we took, like, you know... the model basically came back and it said, you know, here's a molecule, it's a member of this class, Rho-Kinase inhibitors, which could be used to treat AMD, and the way we should test it is that we should go into the wet lab. And so we went and we tested it in the phagocytosis assay, and it worked. So then we took it, we gave the raw experiment results to our data analysis agent, which analyzed them, figured out that the Rho-Kinase inhibitor had worked.
We took those results, gave them back to the hypothesis generation model, which goes and generates some more hypotheses, including finding this, like, already FDA-approved drug, right? Which we're like, oh wow, like, that FDA-approved drug, you know, itself could be, like, a treatment for AMD, which is already known to be safe and effective in other diseases. So we went and we did another experiment, we got the results of that next experiment, which also showed that it worked very well. And then again, we had the data analysis agent analyze the data. And then, we even then did another experiment, actually, to look at and try to understand the mechanism by which these Rho-Kinase inhibitors were having this effect on phagocytosis. And that was, again, something where, like, we went and we did the experiment, then we gave it to the data analysis agent. The data analysis agent was able to go and analyze it and come up with some very interesting findings. But the overall story here, right, is just that, like, this was the first example we had where the agents were able to come up with a hypothesis, we test it in the lab, the agents analyze the data, like, come up with new hypotheses, we test those in the lab, the agents analyze the data again. And basically, like, except for the actual wet lab work, the entire process

Segment 7 (30:00 - 34:00)

is this automated cycle of discovery. And I just think, I mean, this is basically a model for what science may look like in the future, where everything goes much faster and you have much more informed hypotheses as a result of using these agents and integrating them with wet lab feedback. - [Interviewer] How do you think people should think about the benchmarks they see as it relates to AI models? And how does that relate to what you were just talking about? You care about predictions, you care about discoveries. - Yeah, so one of the questions we have to think about all the time is how do we tell whether the models are actually getting better at doing science. The critical thing is basically benchmarking: making sure you have ways of measuring whether or not the AI is getting better at generating new hypotheses, coming up with new ideas, doing new data analyses, and so on. We released a bunch of benchmarks for science called LAB-Bench, which are meant to measure the ability of agents to perform a variety of different tasks in science, and we continue to monitor performance on them. Otherwise, what I would say is that the AI field in general is drawn to benchmarks that are very easy to evaluate and very easy to engineer against. There are things like GPQA, for example, which is a question-answering benchmark, and Humanity's Last Exam. These benchmarks are engineered to get at the capabilities that language models don't have right now but could develop. And in general, the benchmarks are useful and they do measure something. I will say they do not measure whether the model is good at science, right? The idea behind Humanity's Last Exam is that when you have a model that can do 100% of the problems on Humanity's Last Exam, it will be great. You know, that model will somehow be better at science or have more knowledge.
It'll be better at reasoning than any human, right? Because Humanity's Last Exam is supposed to be a bunch of extremely hard reasoning problems. In the same way that if you do really well on the Putnam Exam, which is a math exam, then you're good at something, right? But that does not mean you're gonna become a world-class mathematician and invent new theorems. And being extremely good at Humanity's Last Exam does not mean the thing is gonna be capable of going and doing science. And I think the really hard thing, right, is that science takes time. So how do we measure whether or not an AI is good at doing science? You actually have to go and start to do science with it. We go about this by actually trying to use the models for science internally, trying to use them to make discoveries, putting them into the hands of scientists around the world and seeing how they use them, and just getting real-world feedback, basically, about: is this thing making discoveries? Is it doing a good job, and so on? We retain large numbers of scientists internally, on contracts, who evaluate the models and give us feedback about them. It is harder to measure performance this way than by making a nice question-answering benchmark. I really wish we had nice question-answering benchmarks that could get at the question of whether the thing is good at science. But fundamentally, you know, this is what you have to do to really get at the performance of these models. And then, the last thing there is that the ultimate evaluation for these models is: are they generating hypotheses, are they generating predictions, that work better in the real world, right? And so what you really have to do is have the wet lab in the loop.
And this is one of the things that we're doing that sets us apart: you have to be able, in high throughput, to take the hypotheses the model is generating and the experiments it's proposing, test them in the lab, get feedback, give that back to the models, use that to train the models, and also use it to evaluate the models. And that requires a lot of logistics. You have to actually have a laboratory. You have to have a lot of people in the laboratory who are able to run experiments, and you have to be able to do very diverse experiments, lots of different kinds of experiments. And that's the thing we've been focused on that I think really sets us apart, you know, as in the case of the age-related macular degeneration example. The thing there that really sets it apart is that it's not just that we came up with an idea and tried it in the wet lab. In the cell culture assay, we got feedback. From that, we gave it to the model, right? We were able to go and conduct an actual, real RNA sequencing assay, get real data, and give it to the model. It's the interaction between these models and the real world which is where the rubber actually hits the road, and that's where we're actually gonna find out: are they better at science? And how do we make them better at science? Takes me a little bit to warm up anyways. So I don't know. - [Interviewer] You seem warm to me. - That was not my most articulate description of these things. But, you know.
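[Editor's sketch] The closed loop described above, where a hypothesis-generation agent proposes candidates, the wet lab tests them, and a data-analysis agent feeds results back into the next round, can be sketched in miniature. Everything here (the function names, the toy assay scores, the 0.5 threshold) is an illustrative assumption, not FutureHouse's actual system:

```python
# Minimal sketch of an automated discovery cycle, under assumed toy values:
# generate hypotheses -> test in a (simulated) wet lab -> analyze -> repeat.

def generate_hypotheses(prior_results):
    """Stand-in for the hypothesis-generation agent."""
    base = ["ROCK inhibitor A", "ROCK inhibitor B"]
    if not prior_results:          # first round: no feedback yet
        return base
    # Later rounds: keep only leads that passed the previous assay.
    return [h for h in base if prior_results.get(h, 0.0) > 0.5]

def run_wet_lab_assay(hypothesis):
    """Stand-in for the phagocytosis assay; returns a toy effect score."""
    return {"ROCK inhibitor A": 0.9, "ROCK inhibitor B": 0.2}[hypothesis]

def analyze(results):
    """Stand-in for the data-analysis agent: flag hypotheses that worked."""
    return {h: score for h, score in results.items() if score > 0.5}

def discovery_loop(rounds=2):
    prior = {}
    for _ in range(rounds):
        hypotheses = generate_hypotheses(prior)
        results = {h: run_wet_lab_assay(h) for h in hypotheses}
        prior = analyze(results)   # feedback into the next round
    return prior

print(discovery_loop())  # only the lead that passed the assay survives
```

The point of the sketch is the shape of the loop, not the toy scores: each round narrows the candidate set using real experimental feedback rather than model output alone.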

More videos from this channel: Freethink
