The Future of Math with o1 Reasoning with Terence Tao, Mark Chen, and James Donovan
1:26:07


OpenAI · 13.12.2024 · 67,865 views · 1,424 likes

Video description
Fields Medal-winning mathematician Terence Tao makes his second appearance in the OpenAI Forum alongside OpenAI’s SVP of Research, Mark Chen to explore a future where mathematics and artificial intelligence converge to unlock groundbreaking scientific advancements. The conversation, facilitated by James Donovan, Science Policy & Partnerships Lead at OpenAI, will also explore the potential these advancements hold to impact society in positive and transformative ways.

Table of contents (18 segments)

Segment 1 (00:00 - 05:00)

I'm Natalie Cone, your OpenAI Forum community architect. I like to begin all of our talks by reminding us of OpenAI's mission, which is to ensure that artificial general intelligence benefits all of humanity. To conclude our speaker series for the year, we're hosting one of our favorite all-time guests, Professor Terence Tao, and two of my very inspiring colleagues at OpenAI, Mark Chen and James Donovan. Terence Tao is a professor of mathematics at UCLA. His areas of research include harmonic analysis, PDE, combinatorics, and number theory. He's received a number of awards, including the Fields Medal in 2006. Since 2021, Tao also serves on the President's Council of Advisors on Science and Technology. Mark Chen is the Senior Vice President of Research at OpenAI, where he oversees advanced AI initiatives, driving innovation in language models, reinforcement learning, multimodal models, and AI alignment. Since joining in 2018, he has played a key role in shaping the organization's most ambitious projects. Mark is dedicated to ensuring AI developments benefit society while maintaining a focus on responsible research. Finally, James Donovan leads Science Policy and Partnerships in Global Affairs, focusing on how our models can best be used to accelerate scientific research and commercialization. He came to OpenAI having been a founder, VC investor, and partner at Convergent Research, where he helped launch multiple moonshot science organizations, including Lean FRO, an automated theorem prover for complex mathematics. Please help me welcome our special guests to the OpenAI Forum.

Fantastic, thank you so much, Natalie, I really appreciate the introduction. The floor is yours, James.

Thank you so much. What an honor to be here with such great minds tonight. Before we get going, I do just want to give a big thank you to Natalie and team for organizing all of this; it's no easy thing to get so many people together and run it as smoothly as she always does. It's a great honor for me specifically to be here to talk to you both, so thank you for finding the time. Just as a general note: this is the conclusion of one year's Forum events, and it is, as always, the beginning of the next year, where we'll have a theme focusing on science and how our models intersect with and accelerate science, hopefully safely and equitably for the wider world. So to get going, I wanted to start by getting a sense, maybe first from Terry and then from you, Mark: what are the most interesting questions that you're focused on in your respective fields today, and why is it important that we try to solve them?

Okay, well, there are lots of technical math questions that I would love to solve. More relevantly for this meeting, I'm really interested in how we can rework mathematics from the ground up, and how we can use all these new tools to collaborate in ways that we couldn't before, to do mathematics at a scale we couldn't before. I think it could be a new age of discovery. Right now, mathematicians work on individual problems one at a time; we spend months working on one problem and then move on to the next. With these tools, we could potentially scan hundreds or thousands of problems at once and do really different types of mathematics. So I'm really excited about that possibility.

And you, Mark?

Cool, yeah. One of our big focuses over the last year has been reasoning. Since GPT-4, we've shifted our focus slightly. GPT-4, for all intents and purposes, is a very smart model; it contains a lot of raw knowledge, but it's also stupid in many ways. It gets tripped up by simple puzzles, and it's often very reliant on the prior: if it has some kind of prior knowledge of how a puzzle should shake out, it often makes the same kind of pattern-matching mistake. I think these pointed us to a real deficiency in the model's ability to deeply reason, and so we've been focused on developing what we now call the o-series of models. These are models that are more like System 2 thinkers than System 1 thinkers: they less often give the intuitive, fast response, and instead spend some time reflecting on the problem before producing a response. Just to highlight two other problems that are key to our research agenda: data efficiency is certainly one of them; we care about how to ingest all of the data in the world, including non-text data. And third is a very practical problem: how do we create intuitive, delightful experiences for our users?

Yes, it's true. I mean, that last problem is maybe a little beyond

Segment 2 (05:00 - 10:00)

the world of math specifically, but it is a critical one, that human-computer interface question. Terry, I do want to ask you specifically about the o1 models as Mark has outlined them, but before I do: you just mentioned a potentially new type of math. At various times you've spoken about math at industrial scale, and you've also talked about different ways of cooperating in math. Would you mind unpacking that for us a little?

Sure. Math has always been perceived as a really difficult activity, and it currently is, for many reasons, one of which is that it relies on one human, or maybe a small number of humans, to do a lot of different tasks to achieve a complex goal. If you want to make progress in mathematics, you first have to come up with a good question, then you have to find the tools to solve it, you have to learn the literature, try some arguments, do the computations, check the arguments to make sure they're correct, and then write it up in a way that can be explained. Then you have to give talks, apply for grants, and do lots of other things. These are all different skills. In other industries we have division of labor: if you're making a movie, you don't have one person produce the movie, edit the movie, act in the movie, and get financing for the movie, and so forth. You have different roles. But until recently, we had not found a way to decouple all these tasks in mathematics. Now that we have these tools, in principle you could have a collaboration where one person has the vision, one person, or maybe an AI, does the computations, and another writes the paper, and so forth. You don't need one person to be an expert in all aspects.

I think a lot of people are discouraged from doing mathematics because they look at the checklist of things you have to do to be a good mathematician, and it's really daunting. But maybe there are people who are good at looking at data and inspecting patterns and then asking an AI, can you confirm this pattern exists? Or maybe they're not very good at finding the right questions to ask, but they can work on some very narrow, specific piece of a larger project. So I think these tools allow the job of a mathematician to be decoupled, to be made more modular, so that some tasks are done by AI, some by humans, some by formal proof assistants, and some by the general public. We have citizen science in other disciplines: amateur astronomers who discover comets, or amateur biologists who collect butterflies, but we don't really have a way of utilizing amateur mathematicians outside of some very small fringe projects. So there's a lot of potential, and I think we have to throw a lot of things at the wall and see what sticks.

Terry, I have a quick follow-up question. I'm curious, AI aside, what is the maximum number of humans to date that have been able to collaboratively work on a single math project? Do you think there's an upper limit here?

Right, so in practice the limit is around five or six. It's really hard past that point: you have to check each other's work, and there's just getting everyone in the same room and so forth. There are a small number of projects which have many authors, for example proof formalization projects, where a big proof gets formalized. That's one of the few tasks in mathematics that we already know how to crowdsource and split, because you run it all on GitHub or something, and all the contributions are verified because they're in a formal language such as Lean. So these can have 20 or 30 authors. Lean has this thing called mathlib, a library of all undergraduate mathematics. It's never officially been a research project, but I think technically it has hundreds, or even thousands, of contributors. But it's really only in formal mathematics that we've seen these large collaborations so far.

Fantastic, and I do want to echo that shout-out to Lean; they're doing some really incredible work, and I think we might have a few members of the Lean team on the call today. As you were unpacking that, Terry, it sounded like your default assumption was that humans will still divvy up tasks; they'll still understand enough about the process to decide who's doing what where. My first question for you would be: do you think, therefore, there'll be different roles that emerge for mathematicians, different specialties
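Tao's point that formalization can be crowdsourced rests on the fact that every contribution is machine-checked before it is merged. A toy illustration of the kind of submission involved (assuming a Lean 4 project with Mathlib available; the lemma below is illustrative, not an actual mathlib entry):

```lean
-- A contribution is merged only if it compiles: Lean's kernel checks
-- the proof, so maintainers need not re-verify the mathematics by hand.
import Mathlib.Tactic

-- Toy example of the kind of lemma a contributor might formalize:
-- a² + 2ab + b² is nonnegative, because it is a perfect square.
theorem toy_sq_nonneg_sum (a b : ℤ) : 0 ≤ a ^ 2 + 2 * a * b + b ^ 2 := by
  have h : a ^ 2 + 2 * a * b + b ^ 2 = (a + b) ^ 2 := by ring
  rw [h]
  exact sq_nonneg (a + b)
```

Because acceptance is "does it compile," dozens of strangers can contribute pieces of one proof without anyone auditing the others' reasoning by hand, which is exactly what makes the 20-to-30-author formalization projects workable.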

Segment 3 (10:00 - 15:00)

that they adapt to? And I'd flip it over to you, Mark, to ask whether you think that's likely to always be humans, or whether you see a world in which o1 itself, or the o-series models, are breaking down problems.

Right, so I see software engineering as a template for where math might go. In the past, maybe there was one heroic programmer who did everything, the same way mathematicians sort of do everything now, but today you have project managers and programmers and quality assurance teams and so forth. One can imagine the same in math. Right now I'm involved in several projects which are collaborative, and they involve both a theoretical math component and a formal proof component, and people are also running various code and algorithms. It's already specializing the way I was expecting: there are some people who don't know the math, but they're very good at formalizing theorems; it's almost like solving puzzles to them. Then there are the ones who are good at running GitHub and doing all the project management, making sure the whole backend runs smoothly, and there are people who do data visualization, and so forth. And we all coordinate. So far it's been mostly humans plus more old-fashioned automated tools like theorem provers, and often just running Python code or something, but I think it's a paradigm into which AI would slot very nicely once it gets good enough.

Yeah, that makes a lot of sense to me too. I feel like today I almost treat AI as a coworker in many respects; there are things that I don't do very well that I can farm off to an AI. I'm only conjecturing here, because I'm not a mathematician, but in terms of where AIs might be strong in helping to solve mathematical problems, the first could be just recognizing patterns: machines are fairly good at this, especially if there's a lot of data or just a lot of stuff to sift through. From identifying patterns you can start to form conjectures, and I think they might have a unique strength there. Then there's coming up with proof strategies; Terence, this is something we talked about last time. I think humans today still probably have a better intuition for what the right steps forward are, but they may have a blind spot when it comes to one particular step. Last time we mentioned a generating-function approach that a model suggested in one of the toy problems you were trying to solve, and that actually turned out to be not a terrible idea in that situation. Also verification: models might be able to verify certain steps that you're pretty sure are right but just want another pair of eyes on. And maybe generating counterexamples too: if there's something where you want to think of a lot of potential ways a theorem could be false, a model may be able to exhaust those a lot more efficiently.

That makes a great deal of sense. You've both mentioned in your answers the role of theorem provers, and formalization more broadly. Is it fair to say that you both think that is a necessary intermediary layer between doing the math and using LLMs or equivalent technologies?

Largely, yes. The proof has to be correct, and the thing about math proofs is that if you have a hundred steps and one of them is wrong, the whole proof can fall apart. And AI, of course, makes all these mistakes. There are types of mathematics where a positive failure rate is acceptable; like Mark said, finding patterns, finding conjectures. It's okay to have an AI that is only correct 50% of the time if you have some other way to check it. In particular, if it tries to output an argument, it's a very natural synergy to force the AI to output in something like Lean, and then if it compiles, great; if not, it sends back an error message and the AI updates its answer. People have already implemented this, and short proofs, maybe on the level of an undergraduate homework assignment, can be done by this iterative technique. We're definitely not at the point where you can just ask a high-level math question and it outputs a huge proof. Okay, AlphaProof can do it with three days of compute, but it doesn't scale. For some softer things, where a positive error rate is acceptable, you won't need the formal proof assistants, but for anything really complex, where one mistake can propagate, it's basically

Segment 4 (15:00 - 20:00)

indispensable.

Sorry, please finish.

Yeah, I think at OpenAI, at various times in our history, we've focused more or less on formalized mathematics, and today we do a little bit less, primarily just because we want to explore reasoning in full generality. We do hope that the kind of reasoning you learn in fields of computer science is fairly similar to the reasoning you learn in fields like math. But I definitely understand the advantages of doing formal mathematics.

I'd quite like to come back to that architecture of theorem prover, math, and AI, and see whether or not it holds for other domains of science as well. But one question I have before that: even in the training process, there are probably a lot of incorrect ways of solving things that don't get into the training data, because mathematicians on the whole don't publish incorrect things, and that's true for science more broadly. Do you both think that would make a big difference? Is that a cultural norm we should be trying to push, that people publish failed attempts?

I think that's a good idea. It is hard to encourage; people don't like to admit their mistakes. But this could really be precious training data for AIs. When I teach my classes, sometimes the classes that are most effective happen accidentally, when I've prepared a proof, I give it in class, and I screw up: the proof does not work, and I have to fix it in real time. The class sees me try various things: okay, what if I change this hypothesis, let me try this example. And I've gotten feedback later that those were the most valuable classes I gave, and it was because I made mistakes. This is data that, largely, people like you just don't have access to. I mean, I think in fact many experts in a domain have expertise built on decades of mistakes that taught them what not to do, the negative space. As we move to a more formal environment, there is beginning to be some of this: right now we formalize proofs after they're done, but we will eventually get to the point where we formalize as we go. We'll maybe converse with an AI while we think about math, and we'll try to formalize the steps as we go, and then maybe it doesn't work and you have to backtrack, and so forth. That will naturally create some of this data that we don't have right now.

Out of interest, a lot of mathematicians talk about the beauty of a theorem and the kind of eureka moment when everything fits together and can be expressed elegantly. Is there a chance we lose that kind of cognitive process using tools like these?

Well, a similar situation came up when calculators became ubiquitous. People said, now that you don't have to do arithmetic by hand, you lose your number sense. And to some extent this is true; I would imagine that a mathematician from 50 years ago was much better at getting number sense from direct calculations, but you also get a different type of number sense from playing with the calculator. So I think there'll be a different type of beauty standard. There will be some computer-generated proofs that are also really elegant and amazing, in a different way. But I don't think the AI paradigm will take over completely for many decades; mathematicians are somewhat slow, you know, we still use chalk and blackboard, as you can see. So there'll be people who will still craft really wonderful proofs, and I think there'll be a class of mathematicians who will take AI-generated mathematics and convert it into something much more human. I think that will be a common thing to do in the future.

Out of interest, Mark, when you hear an answer like that from Terry, do you put a lot of thought into not just how to make reasoning high quality and accurate, but also how a human can work with the outputs, that side of the equation too?

Yeah. I mean, when you think about RL, it's also about incentivizing the model and having the model learn from its mistakes, so that highly resonated with me. And I do

Segment 5 (20:00 - 25:00)

think that's how you develop robust and strong reasoning skills. You can't really just be shown a lot of examples of accurate reasoning, because there's so much negative space in mathematical reasoning. I do think models will become much more useful; I'm quite an optimist on this. And in terms of the impacts, it's really interesting to hear that it's not so much that people will lose a sense of aesthetics or intuition, but that new abstraction layers will develop, and new abstractions and intuitions will form out of that. That seems interesting and quite likely as well, and it'll be cool to see, especially if it happens fairly soon.

Yeah, it's a really interesting line to follow. In my own world of biology, the assumption tends to be that these models will find patterns across things that were otherwise seen as unrelated, and you'll find all this underlying unity; but that's premised on the idea that there's lots of low-hanging fruit we just haven't noticed. Whereas for things like math and parts of physics, the refinement is almost in the way the activity is done, and we feel like that might be fundamentally different. I wonder, Terry and Mark, whether you think it will have an implication for how we educate people in math, and in particular how we support people who are going to do frontier math research.

Well, of course, students are already using large language models, most obviously to help with their homework, but also to get a second perspective on a topic. Educators are also figuring out how to integrate large language models into our teaching. One thing that's become increasingly common is to present some math problem, or a problem in some other field, give GPT's answer to it, and say: this answer is wrong, please critique it. Or have a conversation with the AI and actually teach it to fix the answer. There was actually one class where they made a group project: the teacher handed out a practice final for the class and said, okay, using prompt engineering, data analysis, and synthetic versions of the final exam, figure out how to most efficiently teach an AI to solve the final. They had, I think, one group doing prompts and one doing benchmarking, and so forth. Sorry, I lost my light. But it also forced them, for example in order to generate all the data for the synthetic exams, to really understand the class material. So it was actually kind of an excuse to delve deep and learn both the class material and how to use these AI tools. We'll find innovative ways to combine the two.

Yeah, I guess some people point to fears: if you have too much dependence on AI systems, do your skills erode, do you have less insight? I'm actually very curious for Terry's thoughts on this. But while he's figuring out his light, maybe I can...

Okay, yeah. It did give a very dramatic effect; I quite enjoyed the lone genius in the dark, coming out of the cave.

So what was your question again?

Yeah, just: do you think there's any truth to reliance on AI tools leading to less skill in general in mathematics, or maybe a loss of insight, something like that?

Well, it will be a transfer. I think we will use some skills less often, but we will develop other skills more often. There's an analogy with chess. Chess is now essentially a solved problem, but people still play chess quite a lot. The way they practice is quite different now, though: they experiment with different moves and then ask a chess engine, is this a good move or not? So, for example, the study of chess theory is flourishing. There are lots of century-old maxims about what part of the chessboard is good to control and so forth that are actually being reevaluated now, with humans asking

Segment 6 (25:00 - 30:00)

chess engines various questions, and that is a different way of getting intuition about chess than the standard one of just playing lots of games and reading lots of textbooks. So yeah, it will be a shift; it's a trade-off, but I think a net positive.

Yeah. People also ask me how they should adapt to what's coming out. I still think there's largely no need to suddenly abandon studying any particular subject. Really, people should be embracing AI and seeing how it can make them more efficient. In math specifically, it could help you with a lot of tedious computations: if it's some kind of routine thing that you already know inside out, you can just have the model carry out the manipulations. I still think there's a lot of alpha in very deep understanding of a subject. Even in machine learning today, the people who are effecting the biggest change are the people who very deeply understand the math, or the systems, and I think that will continue to be a very big lever. Also, focusing on abstractions: I think humans do have a particular aesthetic that's tied to the core of mathematics, and because other humans are judging that aesthetic, models may have a more difficult time emulating it when it comes to defining the problem and having taste. And of course, math is just a good skill to have: it's very transferable, it teaches you robust reasoning, and people who are mathematicians are very adaptable in general. So there's definitely no reason not to invest heavily in math.

It's an interesting point, actually, Mark, when you talk about the aesthetic of math. We're getting a little abstract, but it is possible that the way we conceive of math is somehow tied to the way we experience reality as humans, and that if you had models doing very sophisticated math, we might get to a point where it exceeds the ability of humans to verify, or even make sense of, in our context. Do either of you see that as a possible future anytime soon, and if so, how would you react to it?

Well, it's actually already the case that mathematicians sometimes produce enormous proofs that no one person understands. We already use a lot of computer assistance. There are some proofs that have terabyte-long proof certificates, because there's a massive SAT-solver calculation or some big numerical modeling. And there are also proofs built on a tower of hundreds of papers in the literature, where we're taking previous results as black boxes and no one person understands everything. So we're already, to some extent, used to this, and in mathematics we can cope because we have this language of abstraction: we can compartmentalize a complex proof, and you just need to understand one piece and trust that either a computer or another human understands the other pieces, and it all works out. So this will keep happening: we will have big, complicated arguments where part of it is AI-generated, hopefully formally verified too. It's just accelerating a trend that's already been happening; I don't see it as a real phase change.

Yeah. A lot of the worries I have are similar: you could have some error that propagates, or other people build on top of some result and you're just building on faulty mathematics, especially if the volume of new computer-generated insights increases.
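The SAT-solver proof certificates Tao mentions work because a claimed proof can be re-checked mechanically, independently of whoever produced it. A miniature of that idea, assuming nothing beyond the standard library: verifying a claimed propositional tautology by exhaustive evaluation (real certificates, such as DRAT proofs, are checked incrementally instead, since terabyte-scale cases rule out brute force).

```python
# Miniature mechanical proof checking: confirm that a propositional
# formula is a tautology by testing every truth assignment.
from itertools import product

def is_tautology(formula, num_vars: int) -> bool:
    """Check `formula` (a function of booleans) on every assignment."""
    return all(formula(*bits)
               for bits in product([False, True], repeat=num_vars))

# Peirce's law ((p -> q) -> p) -> p, a classic tautology,
# written out with "x -> y" encoded as "(not x) or y".
peirce = lambda p, q: (not ((not ((not p) or q)) or p)) or p
```

Here `is_tautology(peirce, 2)` holds, while a non-tautology like `lambda p, q: p or q` fails on the all-false assignment; the point is that trust rests on the checker, which is tiny and auditable, rather than on the enormous search that found the proof.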
One thing that we worry about a lot at OpenAI is the more general problem of scalable oversight. The idea is that when a model spends a lot of time thinking and comes up with some kind of fundamental insight that it has reasoned a long time to arrive at, how do you know the model didn't make a mistake? How do you know it's right? How do you trust it? Fundamentally, it's a very real problem. It felt fairly theoretical maybe a couple of years back, but today models do have the capability to solve very hard problems, and so how do we vet, and trust, that the model came up with the correct answer?

Well, math is the one place where we have a shot, because we have formal verification that can also be
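Short of full formal verification, one cheap automatic vetting step for a model-proposed claim, of the kind Mark described earlier as counterexample generation, is to search small cases for a refutation before trusting it. A self-contained sketch using Euler's famous near-miss "n² + n + 41 is always prime" (here labeled a conjecture purely for illustration):

```python
# Brute-force counterexample search: refute a universal claim by
# testing it on all small cases before building anything on top of it.

def is_prime(n: int) -> bool:
    """Trial division, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def first_counterexample(claim, bound: int):
    """Return the smallest n < bound violating `claim`, or None."""
    for n in range(bound):
        if not claim(n):
            return n
    return None

# Euler's polynomial: prime for n = 0..39, but not for all n.
euler_claim = lambda n: is_prime(n * n + n + 41)
```

Running `first_counterexample(euler_claim, 100)` finds the failure at n = 40, where 40² + 40 + 41 = 1681 = 41². A passed search proves nothing, of course, but a found counterexample settles the question instantly, which is why it is such a natural job to hand to a machine.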

Segment 7 (30:00 - 35:00)

done at automated way no indeed and you would hope that progress there unlocks progress across all the other Sciences ultimately right we can find a way to derive from those mathematical proofs down into physics chemistry and so on um T there are quite a few people uh in the room today who are working practically in math for students or otherwise so I have a few very practical questions um maybe not a phase change using AI or AI related tools but there are some cultural elements of maths practically that might change some of the unique things are mass competitions and I know you were in Bristol not long ago uh on exactly that theme do you see like the actual ecosystem of math changing to accommodate uh LMS and if so how um it will it's hard to predict exactly how it will um I think there'll be new types of mathematics that are not popular now because they're just technically INF feasible so in particular experimental mathematics is a very small segment ma is like 95% theoretical um which is unusual among all the scientists in scientists usually there's a balance between experiment in theory um but experiments are hard you have to be like really good at programming um or um yeah but um and your task has be sort of simple enough that you can automate it with a regular piece of software which is with within the skill of a math addition a program um but with AI you could do um much more sophisticated um Explorations um so you might want to so you know traditionally you might study one differential equation but you might ask an AI you know here's um an analysis of this differential equation now repeat the same analysis for the next 500 um equations on this list and this is something that you can't really automate right now with traditional um tools um because you need the uh the software to do some understanding what the problem is um so I think the type of mathematics will change I think we will be um there's already a trend to become more collaborative and that will 
just um accelerate with AI um but I think to at least for the next decade or two we'll still you know be writing papers and refereeing and doing and you know teaching and so forth I think um it won't be a major change to say that we will use more and more AI in our work just like we're already using more and more computer assistant in our work in other ways yeah and I think yeah uh just a point on the competitions I think I can speak more to programming competitions but um I don't know if they would fundamentally change too much I think um at least most people I know who kind of do that um a lot you know they it's just very fun to do I think um kind of even Beyond kind of you know the technical skills that you gain um and cheating will become a problem that's maybe the one yeah exactly yeah um yeah I mean I think that's also like just a very deep question right it's like um even like how do you interview people when you know um the models can solve very difficult problems um so yeah um but I do think you know contests a big part of the reason people do it is because it's just fun and I think you know um the analogy to chess is a good one yes so cheating is definitely one element of this but I guess the less um deliberate or or um trying to break the rules elements is maybe attribution you know what happens in a world in which we have you potentially large parts of formalization being done by LMS or even novel ideas emerging from LMS because of a combinatorial approach H can you both invis World in which we are attributing breakthroughs directly to LMS themselves and what might that mean yeah this is going to be a big uh issue that we have to face um I think it's um it's already the authorship model that we have of papers where there's like you know um so in the Sciences we have maybe one lead author and then a whole bunch of of secondary authors um and so mathematicians we don't do that yet um we still like order alphabetically by last name and we haven't really uh 
we have largely ignored the question of who did what; we just say we all contributed equally. I think we're going to have to be more precise about attribution in papers in the future. There's already a trend, at least in the sciences, where when you write a paper there's a section on author contributions, who did what. And if the work is on GitHub, you can look at the commits, which also gives you some data. Then maybe there'll be some way to automatically inspect that data and summarize who did what.
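The GitHub-commits idea is easy to prototype: commit metadata already records an author per change, so a first-pass contribution summary is just counting. A minimal sketch; the commit log below is a hypothetical stand-in for real `git log` output:

```python
from collections import Counter

# Hypothetical commit log; in practice this would come from
# `git log --format=%an` or a repository-hosting API.
commit_authors = [
    "Alice", "Bob", "ai-assistant", "Alice",
    "ai-assistant", "ai-assistant", "Bob",
]

def contribution_summary(authors):
    """Share of commits per author, as a crude first-pass attribution signal."""
    counts = Counter(authors)
    total = len(authors)
    return {name: counts[name] / total for name in counts}

summary = contribution_summary(commit_authors)
print(summary)  # the hypothetical "ai-assistant" authored 3/7 of the commits
```

Commit counts are of course a rough proxy; lines changed, or which commits survived review, would refine the picture.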

Segment 8 (35:00 - 40:00)

Yeah, so once half the commits are done by an AI and so forth, there's the question: do you actually promote the AI to a co-author, or do you at least put it in the acknowledgements? We don't have the norms for this yet; we'll have to work it out. There'll be some test cases and some controversies, and we'll eventually work out something that works for everybody, but I don't have the answers for that one. Yeah, I do think there's also a related, though not identical, issue of access. If models continue to contribute large chunks of proofs, are the people who have more access to compute in an advantageous position when it comes to doing mathematics? It's definitely something to think through, and I don't quite know how to follow that train of thought yet, but it's definitely a hard problem. Yeah, it's going to be interesting to see. You already get questions about attribution and ownership, maybe more on the creative side of the world, but it'll be interesting, as AI gets more and more involved in science, to think about intellectual property and how we handle the R&D cycle in such a world. On that topic of applied uses of math, or of science more broadly, for those who are not themselves mathematicians: we've spoken a lot about the act of doing math changing and why that's important. Ignoring the mechanism for how to achieve it, if we were to get to a place where foundational math was being meaningfully accelerated, what would you expect to see happening in the world? What does that unlock for the rest of society? Well, I think it could increase citizen participation in mathematics. One could imagine, for example, people debate about whether the Earth is round or flat; it's amazing how this persists. But with an AI you could actually start constructing models,
and you could say, okay, suppose the Earth were flat: what would the sky look like, and so forth. Right now you need quite a bit of math before you can figure out how much things would change, but you can imagine that with these AI tools, the model could actually just create a visualizer for you, and you could see, oh, this is what this theory of the universe would look like. So I think it could really connect mathematics to a lot of people who currently feel excluded from it because of the sheer technical skill needed to do anything in the subject. Do you think it's a prerequisite that we get better at doing this kind of math in order to use AI in other applied scientific applications? Is it a prerequisite for accelerating engineering or physics? And a question for you as well, Mark: whether you see that as a necessary first step. Well, so much of science is already math-based; if you don't understand the math, you can't model accurately. And certainly on the back end, if you want to train the AI, you need a lot of math for that. But it's possible that we could enter a world where you could be, say, a biologist, and you could ask an AI to run a statistical study or something, and you wouldn't need to know the fine details of exactly what the parameters are. If the AI is reliable enough, it could actually do all the math for you. So it could make the math optional for doing science, in a way that it isn't right now. It could work both ways. Yeah, I trust Terry the most on the implications of accelerated math progress and what that means. Really, as a researcher, and speaking on behalf of a lot of the researchers here, I think the most exciting applications of our models are when they're used to accelerate science.
We're trying to provide this very general-purpose tool that experts can use in their daily lives to just accelerate their work, across the other sciences too. We've seen people in materials science and people in healthcare already use the reasoning models, and they have testimonials to the effect of: hey, this is almost like an undergrad that I can give tasks to, and they come back with fairly coherent analyses of certain situations. Or, as was just said, a lot of people will say: here's a scenario, can you do some calculations, and what would the implications of this scenario look like? And I think people

Segment 9 (40:00 - 45:00)

have found it fairly effective in those situations. No, absolutely. I suppose where my mind is going is that very rapidly you hit a world in which only a very small number of people can actually vet whether or not the answers you're being given are correct, and perhaps the structure of theorem proving, plus an ever more sophisticated LLM in math, is the only way you actually get a scalable verification solution to that problem. So in a way we always have to have formal mathematics at the top, and everything else is derived from it. Given that that's a potential future, and some of the other themes we've spoken about, Terry, do you have advice for young mathematicians on where they should be focusing and the kinds of questions they should be tackling? I think my main advice is that they have to be flexible. Mathematics is becoming more technology-infused and more collaborative. Maybe 50 years ago you could specialize in one subfield of mathematics, barely interact with other mathematicians, and make a living out of that, and that's basically not feasible now. Math is part of a much larger ecosystem, which is a healthy thing, and AI unlocks much broader collaborations than previously thought possible. You could collaborate with scientists in a domain in which you really have no expertise, because the AI can help you get up to speed at a basic level and serve as a sort of universal translator between scientists. So just be open-minded, and also recognize that these tools have limitations. You can't just blindly use them; you still have to build up your own human skills so that you can supervise the AI. It isn't a magic wand. Yes, I don't think even we at OpenAI would encourage folks to use it without quite a heavy bit of expertise and oversight. Maybe a similar
question for you, Mark, but slightly different: based on the trajectory that you're seeing, what skills would you encourage students to pick up now to be able to make the most of these models over time? Honestly, in technical fields we still need technical experts, people who can synergize with the tools very well. I love the general advice to stay flexible. And to shill AI research a little bit, I think it'd be very helpful for people in a variety of fields to at least understand the basics of how neural nets work: how they're trained, what their dynamics are like, and what their limitations are. The more that people play around with these tools and understand how to use them to accelerate their work, the more effective they will be. I do think there will be a kind of multiplier on everyone's efficiency, maybe a couple of years down the line, and that multiplier hopefully will be significantly greater than one. But I do think people who effectively leverage AI tools will be, by and large, more effective than people who are blind to them. Yeah, that certainly resonates. I wonder if the key question has become less whether they will be useful and more the speed of their evolution. In some ways, Terry, you've been on the inside, watching as these models get better at different moments in time, and recently there was the performance on the IMO at silver-medal level, though accepting that there was a bit of massaging to make that happen. Have you been surprised at the rates of progress? Yeah, it's been both exceeding and falling under my expectations. It seems like any task for which you can generate data of similar tasks is now doable; for example, for the IMO thing, DeepMind generated a lot of synthetic proofs,
and apparently even failed proofs; that was part of their secret. So a lot of tasks which I thought would not have been doable for several years are now done. On the other hand, every time you go beyond the sphere where there's data, into a research-level problem where only ten people in the world have really thought hard about the question, the AI tools are still not so useful. I have this project I'm running right now where, instead of proving one big problem, we're proving something like 20 million small mathematical questions, and I thought this was a task
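The brute-force side of efforts like this, throwing many automated attempts at each question and keeping whatever an automated checker accepts, can be sketched as best-of-n sampling against a verifier. The `generate` and `check` functions below are hypothetical stand-ins for a model call and an automated judge (a test harness, or a proof checker in the formalized setting):

```python
def generate(problem, seed):
    # Hypothetical stand-in for drawing one candidate solution from a
    # model; here it "succeeds" on roughly 1 in 100 attempts, mimicking
    # a hard problem with a low per-sample solve rate.
    return "correct" if seed % 100 == 7 else "wrong"

def check(problem, candidate):
    # Hypothetical automated verifier: accepts or rejects a candidate.
    return candidate == "correct"

def best_of_n(problem, n):
    """Draw up to n candidates; return the first one the verifier accepts."""
    for seed in range(n):
        candidate = generate(problem, seed)
        if check(problem, candidate):
            return candidate
    return None

# At a 1% per-sample success rate, the chance that 10,000 independent
# samples all miss is 0.99**10_000, which is essentially zero.
print(best_of_n("hard contest problem", 10_000))
```

The whole scheme only works when `check` is trustworthy, which is exactly why scalable verification keeps coming up in this conversation.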

Segment 10 (45:00 - 50:00)

in which AI would be ideal, because all it had to do was handle some percentage. But it turned out that of all the questions this project studied, maybe 99% could be handled by more traditional computational brute-force methods, and 1% were quite hard and required quite a bit of human intervention. The AI tools that have been tried could recover much of the 99%, the fairly easy problems, but they didn't really contribute to the hard core of the really challenging questions. That could just be the nature of the state of the technology today, so I think there would have to be quite a few more breakthroughs before you see them autonomously solving these research-level questions. Yeah, there's one anecdote in my mind that speaks to this impressive-yet-room-to-go angle. We participated in the IOI this year as well with our o1 models, and on one hand it did take them a lot of samples per problem; we announced in our blog post that you need 10,000 samples per problem to extract gold-medal-level performance from the model, which feels like a lot. At the same time, it's just incredible to me that it can do this at all, and some of these are very anti-pattern-style problems. So it's somewhere in between, and I'm just really excited about getting that capability out. Yeah, it's funny: it always feels a little intellectually unsatisfying when you feel like you've almost cheated, in a way, because you've reconstructed the problem. But then I zoom out and wonder how much of scientific progress is just lots of that stacked together, until it creates a paradigm shift that in retrospect seems very clever but was actually just little things put together. To some
degree, the joy of programming is exactly that: you redefine a problem such that it can be solved, not necessarily working your way through it from first principles. It does raise a question for me, though, which is that maybe what we're talking about here is teaching the models to reason in a specific way, and that category of reasoning works well for some types of problems. Maybe starting with you, Mark, and then on to Terry: do you envision a world in which one class of models does lots of different types of reasoning simultaneously, or is it more likely to be a world in which you have individual models doing different types of reasoning that come together? And then for you, Terry: what kinds of reasoning would you need to see to believe AI could unlock the smaller subset of more challenging questions that it currently struggles with? Yeah, I do think there's beauty in having one model that can reason across a bunch of different domains. When you try to hook up a lot of complicated systems, you make a lot of design choices, and simplicity is really one of the key mantras in AI development. Of course, you could also set up structures of AIs that collaborate in a certain way, and that's also very exciting. Could we build out this model of: you're a specialist here, you're the PM of this math project, you're the proof writer, and you're checking the 10,000 cases, or something like that? That's also a very interesting paradigm. Well, I definitely see AI problem-solving as complementary; it's a very data-driven way of solving problems, and as you said, for certain tasks it's actually much better than humans. What we're learning, actually, is
that our perception of the difficulty of certain tasks has had to be recalibrated, because we just hadn't tried a data-driven approach on certain classes of problems. But some problems are genuinely hard. In math there are even questions that are undecidable: no amount of data can solve them, and we can actually prove that they can't be proven. So, and this is not really AI's strength, if you want an AI to really compete on solving math problems the way that humans would, it needs to reason in data-scarce environments, where there's a new mathematical object you're studying, you know five or six facts about it and a small number of examples, maybe there's a very vague analogy with some other mathematical object that's already out there, and you have to extrapolate from a very small amount of data what to do

Segment 11 (50:00 - 55:00)

next. This is something that AIs don't excel at, and maybe trying to force them to do it is simply using the wrong tool to achieve the task. This is something that humans are actually really good and efficient at; it's the brute-force checking and case analysis and synthesizing, the finding of patterns, that humans are not so good at. So it may be a mistake to think of intelligence as a one-dimensional scale on which either AI or humans is better; you should really think of them as complementary. Yeah, I do hope that if we're successful in our research program, we'll have very data-efficient reasoners too, so hopefully we can prove you wrong, Terry. Okay, well, I would love to be proven wrong. We're coming up to the end of our time, so maybe to end this, and as a way of tying it all together: if you were appointed tomorrow to be vice-chancellor of a university and given some meaningful budget, what would you set up to make an effective math department in your case, Terry, and maybe a broader science department in your case, Mark, and what infrastructure would you invest in to really take advantage of these new technologies? That's a good question. I can imagine having some centralized compute resource to run local models that you can tune yourself, and so forth. It's a little hard; the technology is changing so fast that an investment in any specific hardware or software now may not be so important in a few years. Certainly some location where you can bring together lots of people from different disciplines to figure out ways to use these technologies together; we're already developing lots of these tech-hub-type things. But I think it has to be very
free-form, because the technology is so unpredictable, but we need the different departments to talk to each other and see where the synergies are. Do you see room for a concerted effort around math libraries and those kinds of building blocks, improving them and things like that? Yeah, there are already volunteer, crowdsourced efforts right now. The federal funding agencies in the US are just beginning to fund a little bit of this. Universities generally have not done this kind of fundamental infrastructure-type work; that may be an area where government will actually have to play a leading role. And for you, Mark? Yeah, I'll give a very short answer: I think OpenAI is doing it right. Build a very big computer, and figure out how to turn the computer into intelligence. It's a pithy answer, and one I think Sam would be proud of too, Mark, so that makes a lot of sense. Well, guys, I just want to say thank you so much, both of you, for finding the time to talk to us today. We will be moving from this into a Q&A, so anyone who has more difficult questions for you both will get the chance to fire them away. Terry, in particular, thank you for dialing in and giving us the time for this conversation. And with that, I'll pass back to Natalie. Thank you so much, fellas; I'll see you in the Q&A. So everyone, if you would like to ask Terence, James, or Mark your questions live, please join via the live notification link that just popped up, or go to the agenda tab on the left side of your screen and jump into the Q&A meeting room. I'll see you there in a second, and we'll address as many of the questions you dropped in the chat as we have time for. See you soon. Eduardo, let's get the party started with you; would you like to introduce yourself? Yeah, I'm Eduardo, a mathematician by training, now also doing AI; I trained about 50 years ago, literally 52,
actually. My question for Terry: 35 or 40 years ago, I officially asked the American Mathematical Society, through Felix Browder, who was a colleague of mine at Rutgers at the time, to propose a big-scale mathematics project, similar to the supercollider the physicists were building at the time. And I said,
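The searchable-theorem-database idea Eduardo goes on to describe is close to what embedding-based semantic search provides: represent each statement as a vector and retrieve by similarity to an informal query. A toy sketch, with a bag-of-words counter as a hypothetical stand-in for a learned text-embedding model and a tiny made-up theorem library:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would call a learned
    # text-embedding model here instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical mini-library of theorem statements.
library = {
    "Bolzano-Weierstrass": "every bounded sequence of reals has a convergent subsequence",
    "Lagrange": "the order of a subgroup divides the order of the group",
    "Fermat little": "a to the p is congruent to a modulo a prime p",
}

def search(query):
    """Return the library entry most similar to an informal description."""
    q = embed(query)
    return max(library, key=lambda name: cosine(q, embed(library[name])))

print(search("bounded sequences always have a subsequence that converges"))
```

With word counts the matching is brittle; the point of learned embeddings is that "has a convergent subsequence" and "some subsequence converges" land near each other even with no words in common.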

Segment 12 (55:00 - 60:00)

let's computerize; let's form a database of basic mathematical theorems in some sort of unified language, so that people can refer to those things and find them easily. I was laughed out of the room: this guy is crazy, a crackpot. But obviously we're now in a situation where this can begin to happen. So my question to you, and I posted this in the questions, is this. For me, the most frustrating thing in mathematical research is that you're trying to prove a little lemma, and you know a hundred people must have proved it, whether in algebraic geometry, commutative algebra, group theory, PDEs, whatever, and it's so hard to find the answer that you end up proving it yourself. So my question is: do you see, in the relatively near future, meaning not 20 years from now but maybe three, four, five years from now, a capability, through some kind of learning, perhaps some sort of attention-based thing where you recognize patterns by what is embedded and what's related to what, that would really be able to do this? You know exactly what I'm talking about, right? Semantic search for math would be fantastic. o1 actually already does a little bit of this. I did some experiments: if there's a theorem, a result you've heard of, where you think you know the name or roughly what it is but can't remember it, so you can't just type it into a search engine, you can describe it in informal terms to an LLM, and it can often actually say, oh, you're thinking of this particular theorem. For a more obscure result, buried in one of 20 papers on the arXiv somewhere, we don't have that capability right now. That is a great problem; I pose it to a lot of people I talk to in machine learning: is there some way to extract the essence of a mathematical result and search for it? Right now the best way to do it is crowdsourcing; you go to a question-and-answer site like
Math Overflow. Right, that works, yes, but that's basically it right now. Okay, thank you, Eduardo; so good to see you. So, Lizzie, you're going to be up next, but before we jump to your question, we're going to let our technical producer find you and unmute you, and we're going to give Terence and Mark a question from Neo Sengupta, Chief Privacy Officer at Robinhood. Neo asks: Terence, what's your gut feeling about hard constraints, if any, that these models currently have and will continue to have when it comes to solving previously unsolved mathematical problems? Hard constraints are remarkably few. There are a few questions that are just genuinely undecidable, and there are ones which we know imply other questions that are hard, and which we know are immune to a lot of standard techniques. But there are always surprises. In human mathematics, every year there's a problem which people thought was impossible, and some human comes up with an ingenious new idea. That's the beauty of math: we don't actually know what's hard. So there are very few hard constraints, I'd say. Mark, anything from you? Yeah, I largely agree with that perspective. I think "hard" is a very strong word. There are certainly aspects of mathematics which are difficult for the models today, like asking the right questions, or having an aesthetic for which abstractions to build; they're much better in this ask-a-question-and-try-to-solve setting. Thanks, Mark. Lizzie, welcome to the Forum; would you like to introduce yourself? Yes, I'm currently a medical student at Stanford, in neuroscience, which is the real neural network, if you don't mind my calling it that. I'm trying to apply the LLMs and AI models I'm still learning about to AI drug discovery, but I don't have a question about that, because there are too many
questions regarding that issue. My question is that I ran into a technical issue. I live in San Francisco, and I wanted to go to the San Francisco Opera this weekend, meaning the past weekend. I typed into ChatGPT and asked when Carmen was on, because that was the schedule I wanted, and ChatGPT told me I could go on Saturday. So I went, and there was no show; it was only Sunday at 2 p.m. So, with this technical difficulty,

Segment 13 (60:00 - 65:00)

then, how can I trust or use the system in a more, how do you say, cautious way when doing AI drug discovery, where I don't know the answer, cannot check it, and where errors will have a longer-lasting impact? I'm sorry to bring up this issue. Oh no, of course; it's a very fair question, and I'm probably the person who should answer it. I would encourage you to try using the models with search today. There are existing ways to have the models browse and ground their responses in outside sources; if you use search today, it will cite particular websites or sources that reflect ground truth. Future versions of this will be extremely precise: they'll tell you the locations within those websites where you can find the answer and check the reference for yourself. I do think future models will be very grounded in this way; you'll be able to trace exactly where the model got a particular piece of information. But today I'd encourage you to try the same query with search enabled. I did use o1, which I paid for. Yes, o1 is not a search-enabled model. Ah, okay; then can you explain what the search is? Yeah, there's an icon. If you go to ChatGPT, and I know it's very confusing today, we will unify things and make everything much simpler, but there's a globe icon, and it essentially enables the model to search the internet for results. Mark, you've got a very promising career in customer support. Lizzie, thank you so much for your question; so good to meet you, see you soon. So next on deck will be Daniel McNea for a live question, and while we queue him up I'm going to ask a question from Ahmed Elgamal, founder at Playform AI and professor at Rutgers University. I just want to share with the community that Dr. Elgamal is one of the pioneers of AI art,
and he has a really beautiful presentation that we have archived in the Forum, for anyone who hasn't met him or wasn't with us for that presentation. His question is: what do you think is needed to go from where we are now, where AI can solve math-Olympiad-style problems, to the point where AI can solve PhD-level math problems? Mark or Terry, either of you can take this one. Right, I think it depends on whether it's with human assistance or without. If it's human-supervised, it can certainly help; it can already do a lot of the more menial tasks in a math project. As I said before, it's missing a lot of the strategic planning, what to do when there's no data to tell you what to do, and I'm not sure how to get past that, other than having human experts supervise, so far. Yeah, at a meta level, zooming out, I think if you look at how self-driving cars have evolved: when do you get to the point where you can trust the car to take you from point A to point B without supervision? The underlying progression wasn't magic; it was just more and more reliability over time. You start with something that's 90% accurate in making decisions, then 99%, then 99.9%, and of course there are never any guarantees that the car will always succeed or never make errors. But the amount of direction and supervision will probably shrink over time; you'll be able to trust the model with more self-contained tasks that require longer trajectories of thought on its own, and it will become more and more reliable at that. Just to jump in on that: I think one area where this starts to get really fascinating is things like physics and math, where some of the answers at least are axiomatic and you can go first-principles down, and you can see how longer training cycles and improved reasoning models get you to those answers. But I think about applications in biology, where there's a huge amount of redundancy, and it's some combination of probabilistic plus first-principles plus contextually determined, and you wonder whether that requires a different approach, whether that's also going to be amenable to this generalizable first-principles approach. And as we explore those kinds of things, I think you start to get quite interesting
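The reliability progression in the self-driving analogy compounds sharply over long trajectories, which is easy to quantify: under a simple independence assumption, a model that is 99% reliable per decision completes a 100-step task only about a third of the time, which is why per-step reliability has to climb toward 99.9% and beyond before long autonomous runs become trustworthy.

```python
def task_success(per_step, steps):
    """Probability an agent makes `steps` decisions without a single error,
    assuming independent per-step reliability (a simplifying assumption)."""
    return per_step ** steps

for p in (0.90, 0.99, 0.999):
    print(f"per-step reliability {p}: 100-step task succeeds "
          f"{task_success(p, 100):.1%} of the time")
```

Real errors aren't independent, and agents can sometimes catch and correct their own mistakes, but the qualitative point stands: small per-step gains translate into large end-to-end gains.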

Segment 14 (65:00 - 70:00)

insights about what needs to be true to solve these interlocking, interdependent problem sets, versus classical top-down problem sets. Thank you, James. Daniel, so good to see you. Yeah, great to see you too. Welcome. I think last we spoke, a couple of years ago, you were wrapping up your PhD, so would you like to let us know where you are now and introduce yourself to the community? Yeah, hi everyone, I'm Danny. I did my bachelor's in math at UC Berkeley, and up until about six months ago I was a PhD student in AI for science at the University of Wisconsin; now I'm actually in law school, working on AI-and-law-related topics, so I've done a bunch of different things. My question for Professor Tao: I know that historically the math theory has developed first, and then researchers in other fields, physics especially, or chemistry or other domains, take that theory and apply it to their problems. Now, with AI being such a big thing, do you see any feedback going the other way? I know in physics people are using machine learning a lot for simulating computational solutions of PDEs and things you can't solve with traditional methods. Do you see mathematicians gaining any new insights into theory from those other fields, especially because we can now generate a lot more data? Yeah, mathematics has always been a two-way street. There have been discoveries by physicists that mathematicians didn't have an explanation for, and then they had to develop new mathematics. Dirac invented something called the Dirac delta function, which wasn't a function according to orthodox mathematics, and we had to enlarge our notion of what a function is. It's always gone both ways, so I can imagine a very practical, science-driven application, maybe powered by AI, discovering some new phenomenon that just cries out for
explanation; it will be discovered empirically, and then mathematicians will be motivated to find theoretical explanations. So it's always been a two-way street between the theoretical and applied sciences. Awesome, thanks; nice to see you, Daniel. Okay, let's please queue up Ashish, and while we do that, we'll take a question from the chat. Terence, I think this one is for you, and I hope I can pronounce all of it accurately: can the universal approximation theorem be dethroned? A recent paper on Kolmogorov-Arnold networks has gained a lot of hype; what are your thoughts? Okay, I don't know this particular result. The universal approximation theorem tells you that any operation can in principle be modeled by a neural network, but it's a pure existence theorem: it doesn't tell you how to find this neural network, which may be too impractical to actually use. But it does tell you there is no theoretical obstruction to having neural nets handle very complicated problems, as opposed to, say, perceptrons, which do not have a universal approximation property. In general, the whole theory of machine learning is really lagging decades behind the practice. We do have a few bedrock theoretical results, like the universal approximation theorem, but we don't really have a good explanation of why neural nets work as well as they do on some tasks and why they're terrible at others. So there's certainly a lot of theoretical work to be done. Thank you, Terry. Ashish, welcome to the Forum; would you like to introduce yourself? Thank you, Natalie. My name is Ashish; I work at Microsoft as a product manager, and I build a no-code platform for AI. So my question is, I actually want to describe a workflow I use to do stuff at my work, which is: I use o1 for deep thinking on any topic I'm working on, and then I use GPT-4o to do research, and then finally,
these are different tabs on my browser and then I finally use uh foro with canvas to kind of put it together all of that right um so this is kind of human curated workflow I'm trying to figure out if there would be an easier way to

Segment 15 (70:00 - 75:00)

do this in the future. Very good question. I alluded to this a little bit in a previous answer, but there are so many models, and part of the reason it's confusing today is that o1 was always meant as a research preview; we just wanted to showcase more advanced reasoning capabilities to the world. We will make it a lot less messy: we want to integrate everything together and make it very seamless, and I think that will provide a much better experience for you. Again, it's hard to promise a date on this, but I think your workflow will become a lot simpler. Thank you. Okay, can we please queue up W? And W, you're going to have to tell me how to pronounce your name; please forgive me if I was inaccurate there. And then we'll take a question from the chat. This is from Michael Skiba, software engineer at Insero XS: given the capacity for collaboration among humans, could the diversity of having multiple models reasoning together elicit greater creativity in proofs that a single model would fail to reach? Maybe you can kick this one off, Mark. Yeah, I think that's a very reasonable hypothesis. Any time you have multiple agents in a system, where maybe the agents have different incentives, or you create some kind of environmental dynamics between them, you can get very interesting behavior, and that certainly should be the case for AI agents as well. One particular instantiation of this could be that they end up specializing in the ways Terence has described in the past: one becomes a product manager and one becomes more of an executor, so you could imagine them developing specific roles. Is it guaranteed that this kind of specialization will outperform a single very powerful thinker on its own? I think that's still a little unclear, but it's certainly very interesting to explore. I don't know, Terry, if you want to add something. No, I think it's good to have a very diverse set of approaches to solving these problems. For problems where there's a really well-defined metric you're trying to optimize, having a team of competing AIs trying to optimize that benchmark could probably do better than it would on a very vaguely defined task, where having too many voices may actually make things harder to manage. W, welcome, and if you unmute yourself, could you first tell us how to pronounce your name so that in the future I say it correctly? Oh hello, how are you? Wonderful, can you hear me? We can hear you loud and clear. Oh, okay, I'm hearing the music right now and I can't hear the audio for some reason. Maybe you have two tabs open; I'm reading your lips: refresh the tab. I think you have two tabs open, but yes, I would refresh the tab. So W, how do we pronounce your name? Actually, okay, let's go to a question from the chat and we'll get back to our guest while he figures out his audio. This is from Anid Kashik, Executive MBA at Wharton and Google: AI explainability still remains an area of research requiring additional investment; where can mathematical theory help in the formal characterization of AI from a systems standpoint? Yeah, I think this is an area where the theory is very far behind where it should be. We have some difficulty results showing that, at least for current models, it is provably hard to unpack, given a model, exactly what route was taken to get to an answer. The current architectures are not designed at all to do this kind of tracing. It should be possible, but there would be a trade-off: it would come at a huge hit in performance and training and so forth, so there's a reason why we don't do it right now. People are beginning to do sort

Segment 16 (75:00 - 80:00)

of post-hoc statistical analysis on a model: you can take a neural network and turn part of it off, or swap part of it with something else, and start to see which parts of the network were the most critical in arriving at an answer. But it's still largely empirical; we don't really have a good, robust theory on this. Yeah, I largely agree with that. I think interpretability today is a very empirical science, and there are a lot of mechanistic interpretability techniques that are effective at identifying subnetworks, or parts of a network, that are responsible for certain behaviors. But it is hard, unless you architecturally bake it into the model in a certain way, to just say, "this is why the model did a certain thing." Thanks, fellas. Let's bring AI Taj to the stage, please, and while we queue him up, here's a question from Belinda Mo, a master's student at Stanford: what are the neural network architectures and dataset formats that you find most promising for theorem proving, especially related to Lean's mathlib? I don't know if there's any specific architecture that is better or worse; I think we actually have to do empirical studies. We need to create datasets of thousands of theorems and just test different architectures and see what happens; we don't have a theoretical prediction of which ones are going to work right now. Yeah, very much true. In some sense there's been a convergence of architectures on the language modeling side: oftentimes they are variants of Transformers, and today people are exploring what the next generation of Transformers might look like, things like state space models. But the jury is still out on which one specifically makes a very good theorem prover. One thing that's been surprising in AI is that systems which try to incorporate domain-specific knowledge often don't outperform general-purpose ones, the "bitter lesson," right? It's still a bit of a puzzle why that's true. Yeah, in some sense it's the architectures that best leverage the underlying hardware; maybe that is just a bigger gain than any kind of specific engineering you could do. We were... W is having a technical issue, so I'm going to take another question from the chat. This is from Sheeran Sha, director at Visual Academics: any advice for researchers from other fields who want to collaborate on AI and mathematical problems despite not being trained mathematicians? What are the common pitfalls, and how can they be avoided? I think there isn't enough of a body of successful examples yet to tell what the common pitfalls are. The projects that will be well suited to this paradigm in the future are really large collaborative projects, run out of a GitHub repository, where the work can be broken up into lots and lots of little pieces, some of which may require math expertise, some expertise in other sciences, and some facility with AI, but you don't have to be an expert in everything. That's just beginning to happen; I'm running a pilot project right now trying to see if that kind of thing is possible, but there are certainly not so many of them that you can just sign up for one right away; there are maybe three or four of these things floating around right now. I think you would have to connect with the right human collaborators: an expert mathematician, an expert in AI, and so forth. And I think you need a lot of [unclear]; right now you can't just start a project by yourself, because there are so few models to follow for figuring out what to do. We need to experiment, try lots of things, and see what sticks. Thanks, fellas. And last but not least, can we please cue up Jordan? Jordan is a longtime Forum member, with us from the beginning, and he has a unique perspective: he comes from a background at Google but is also a marketing
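As an aside on the ablation-style analysis discussed above (turning parts of a network off to see which were critical to an answer), here is a minimal, purely illustrative sketch; the network shape, weights, and input are all invented toy values, not anything from the discussion or any lab's actual tooling:

```python
# Toy ablation study: zero out one hidden unit of a tiny fixed two-layer
# ReLU network and measure how much the output moves. Units whose removal
# changes the output the most were most "critical" for that input.

def forward(x, w1, w2, ablate=None):
    """Two-layer ReLU network; if `ablate` is an index, that hidden unit is zeroed."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    if ablate is not None:
        hidden[ablate] = 0.0
    return sum(w * h for w, h in zip(w2, hidden))

# Hand-picked toy weights: hidden unit 0 carries almost all of the signal.
w1 = [[1.0, 0.0], [0.0, 0.1]]   # input -> hidden weights
w2 = [2.0, 0.5]                 # hidden -> output weights
x = [1.0, 1.0]

baseline = forward(x, w1, w2)
impact = [abs(baseline - forward(x, w1, w2, ablate=i)) for i in range(len(w1))]
print(impact)  # ablating unit 0 moves the output far more than ablating unit 1
```

As Tao notes, this kind of probing stays empirical: it ranks components for one input, but offers no general theory of why the network computes what it does.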

Segment 17 (80:00 - 85:00)

professional, so I'm excited to hear from him. You're making this brown man blush. Thank you very much, Natalie; you've done a great job this year: amazing Forum, great guests, everything's fantastic. Caitlyn too. Mark, thank you for presenting; thank you, Terence; and thank you, James. I just wanted to ask you, Mark: what are some awesome use cases for o1 that you're seeing people not talk about, that you think should get more love? Thanks. Yeah, really good question. I think there is this misconception that reasoning lives only in math and coding, but a lot of the use cases we've seen have really showcased reasoning across diverse domains. In linguistics, for example, o1 can really unpack and help with linguistic puzzles, breaking ciphers, discerning patterns in data. So I would challenge you to look at use cases outside of pure math and coding, even though it certainly excels at those, and to see reasoning as something very general and broad-based. Another example: James and I have worked on partnerships with materials science organizations and other external organizations, and they have found reasoning to be extremely effective there as well. Thank you, Mark. Anything else from Terence or James? Okay, James, did you have something to say? I was really just going to echo Mark and say that there is a tendency sometimes to think that unless the models can answer every scientific question perfectly, there's no utility; you either need 100% or nothing, as a sort of binary question. But often, being able to accelerate smaller parts, to Terry's point about math more broadly, is itself a huge compounding gain. Very often the impact of science isn't just the theoretical work or the experimental work; it's the ability to commercialize and bring that work into the real world, and we see really quite transformational gains across each of those things, particularly in that last bucket, which I hope will ultimately result in better and more science coming into the world. I've heard that for big drug companies, part of their biggest gain from using AI tools is actually accelerating their regulatory paperwork. Awesome, thank you so much, fellas; what a beautiful talk to end 2024 with. We'll send all of this to you via email, and the recording will be published in the Forum by early next week. As James said, this really is just the beginning of our deeper dive into how our new reasoning models can accelerate math and science, so we can't wait to host you again in 2025. Thank you so much, Dr. Tao; thank you so much, Mark. James, what a beautiful facilitation; I definitely couldn't have handled that with so much grace, and I'm so glad you came and participated as the facilitator for this talk. And Terry, thank you so much; I was just thinking that it's been a little over a year since the last time you joined us, so thank you for all of the grace and flexibility as we were planning this event, and I hope we can make having you back a yearly ritual. It was a pleasure; thank you. Okay, fellas, that was our last expert talk for 2024, but we are hosting one last technical office hours on December 19th. For the community members here who are new: our technical office hours are an opportunity to meet for one hour with a software engineer, a solutions architect, or a solutions engineer and potentially get your technical challenges solved, get unblocked, and get some ideas related to your use case. I think it's a really beautiful opportunity to connect one-on-one with our technical team at OpenAI. And finally, since all the new members are hearing this for the first time, we want you to know that this is your community, and you now have the agency to refer peers and people in your network. We prioritize referrals from the community, so we're going to drop the referral application in the chat, and we would love to incorporate some of your community members soon. In the next few weeks, definitely by the first few weeks of January, we're going to be launching geographic chapters and interest groups

Segment 18 (85:00 - 86:00)

that means you'll be able to self-organize: you can find people in your part of the world, connect with them, and host coffee chats, and I hope this makes it easier for you to continue conversations outside the walls of the Forum. And this event isn't over: if you want to meet each other one-on-one, we're launching another notification; you can go into the virtual networking time and be matched one-on-one with other members of the community. The default is 10 minutes, but feel free to cut it short after a few minutes so that you can meet more people within the allotted time. That's all we have for the evening. I'm so very pleased to end 2024 on this note: what a beautiful event, and what amazing people in our community. I feel so grateful that this is my job, and I really love hosting all of you. Happy Tuesday, everybody, and we hope to see you soon. Good night, everybody.
