Why was OpenAI surprised by ChatGPT’s success? What does it really mean to “reason” in an AI system? And what’s next for agentic coding and multimodal assistants? OpenAI Head of ChatGPT Nick Turley and Chief Research Officer Mark Chen unpack it all in a conversation that pulls back the curtain on the making of OpenAI’s most iconic product.
00:00 Intro: Meet Nick Turley and Mark Chen
00:40 Origin of the name "ChatGPT"
03:50 ChatGPT’s viral takeoff
07:00 Internal debate before launch
09:40 Evolution of OpenAI’s launch approach
11:00 The sycophancy incident and RLHF
14:45 Balancing usefulness vs. neutrality in model behavior
20:00 Memory and the future of personalization
22:50 ImageGen’s breakthrough moment
29:00 Cultural shifts in safety and the freedom to explore
33:10 Code, Codex, and the rise of agentic programming
37:45 Coding with taste
41:45 Internal adoption of Codex
43:40 Skills that matter: curiosity, agency, adaptability
46:45 OpenAI’s “Do Things” culture
51:30 Adapting to an AI future
55:15 The opportunities ahead: healthcare, research
01:01:00 Async workflows and the superassistant
01:05:40 Favorite ChatGPT tips
Intro: Meet Nick Turley and Mark Chen
Andrew Mayne: Hello. Andrew Mayne: I'm Andrew Mayne, and this is the OpenAI podcast. Andrew Mayne: My guests today are Mark Chen, who is the chief research officer at OpenAI, and Nick Turley, who is the head of ChatGPT. Andrew Mayne: We're gonna be talking about the early viral days of ChatGPT. Andrew Mayne: We're gonna talk about ImageGen, how OpenAI looks at code and tools like Codex, what kind of skills they think that we might need for the future, and we're gonna find out how ChatGPT got its totally normal name. Andrew Mayne: Even half of research doesn't know what those three letters stand for. Nick Turley: You know, you're gonna have an intelligence in your pocket that can be your tutor, your adviser, your software engineer. Mark Chen: There's a real decision the night before. Mark Chen: Do we actually launch this thing?
Origin of the name "ChatGPT"
Andrew Mayne: First off, how did OpenAI decide on that awesome name? Nick Turley: It was gonna be "Chat with GPT-3.5", and we had a late night decision to simplify. Andrew Mayne: Wait. Andrew Mayne: Could you say that name again? Nick Turley: It was gonna be "Chat with GPT-3.5". Andrew Mayne: Chat. Nick Turley: Which rolls off the tongue even more nicely. Andrew Mayne: And you said that was a late night decision, meaning like weeks before you finally decided what to call it. Nick Turley: Right. Nick Turley: No. Nick Turley: Weeks before, we hadn't started on the project yet, I think. Andrew Mayne: Oh, goodness. Nick Turley: But, you know, I think we realized that would be hard to pronounce and came up with a great name instead. Andrew Mayne: So that was the night before? Nick Turley: Roughly. Nick Turley: Yeah. Nick Turley: Might have been the day before. Nick Turley: It was all kind of a blur at that point. Andrew Mayne: I would imagine a lot of that was a blur. Andrew Mayne: And I remember being in a meeting where we talked about the low-key research preview, which it really was. Andrew Mayne: Like, we really thought, oh, this is low-key because it was the 3.5. Andrew Mayne: 3.5 was a model that had been out for months. Andrew Mayne: And from a capabilities point of view, when you just look at the evals, you're like, yeah, it's the same thing, but we just put the interface on here and made it so you didn't have to prompt as much. Andrew Mayne: And then ChatGPT comes out. Andrew Mayne: And when was the first sign that this thing was blowing up? Nick Turley: I mean, I'm curious, because everyone has their own slightly different recollection of that era; it was a very confusing time. Nick Turley: But for me, day one was sort of, you know, is the dashboard broken? Nick Turley: Classic, like, the logging can't be right. Nick Turley: Day two was like, oh, weird. Nick Turley: I guess Japanese Reddit users discovered this thing.
Nick Turley: Maybe it's like a local phenomenon. Nick Turley: Day three was like, okay, it's going viral, but it's definitely gonna die off. Nick Turley: And then by day four, you're like, okay, it's gonna change the world. Andrew Mayne: Mark, did you have any expectation about that? Mark Chen: No. Mark Chen: Honestly, I mean, we've had so many launches, so many previews over time, and, yeah, this one really was something else. Mark Chen: Right? Mark Chen: The takeoff ramp was huge, and, yeah, my parents just stopped asking me to go work for Google. Andrew Mayne: Wait. Andrew Mayne: So wait. Andrew Mayne: Wait a second. Andrew Mayne: Up until ChatGPT, your parents were asking, like, what are you doing there? Mark Chen: Yeah. Mark Chen: No. Mark Chen: I mean, yeah, they had just never heard of OpenAI. Mark Chen: Right. Mark Chen: I think for many years they thought AGI was this pie-in-the-sky thing, and that I didn't have a serious job. Mark Chen: So it was a real revelation for them. Andrew Mayne: Yeah. Andrew Mayne: What was your job title at the time? Mark Chen: I think just member of technical staff. Andrew Mayne: Yeah. Andrew Mayne: And then that blew up, and now you're head of research? Mark Chen: I guess so. Mark Chen: Yeah. Andrew Mayne: So alright. Mark Chen: Yeah. Mark Chen: Actually, on the GPT name, I think even half of research doesn't know what those three letters stand for. Mark Chen: It's kind of funny. Mark Chen: You know, like, half of them think it's "generative pretrained" and half think it's "generative pre-trained" transformer. Andrew Mayne: And what is it? Mark Chen: It's the latter. Andrew Mayne: Okay. Andrew Mayne: Alright. Andrew Mayne: Yeah. Andrew Mayne: Those people, they don't know the name of it. Andrew Mayne: Yeah. Andrew Mayne: It's weird how just a silly name like that all of a sudden becomes a thing. Andrew Mayne: But you see that with, like, you know, Google, Yahoo, Kleenex, things like that. Andrew Mayne: Xerox.
Andrew Mayne: And some of those were names by intention, and this was really just a silly sort of name. Andrew Mayne: For me, the moment after watching the launch, watching it accelerate, when I knew what was gonna happen, was when it was on South Park.
ChatGPT’s viral takeoff
Andrew Mayne: And remember when South Park made fun of the name? Nick Turley: That was the first time I'd watched South Park in, oh, let's just say a while. Nick Turley: And that episode, I still think it's magic. Nick Turley: Yeah. Nick Turley: It was obviously profound to watch and see, you know, something you helped make show up in pop culture. Nick Turley: But there's the punch line in the end where it's like, oh, this was co-written by ChatGPT. Andrew Mayne: I think they took that off, though, in later episodes, because it used to say, I think, written by, like, Trey Parker and ChatGPT. Nick Turley: Oh, man. Andrew Mayne: I think they may have pulled that off at some point. Andrew Mayne: I don't remember. Nick Turley: I strongly feel that you shouldn't have to give credit to it. Nick Turley: It's not always necessary to disclose that you're using it. Andrew Mayne: If I had to give credit to ChatGPT for every aspect of my life, well, might as well just say "ChatGPT, maybe with Andrew." Andrew Mayne: True. Mark Chen: Did you use it for prep for your interviews? Andrew Mayne: You know, one of my co-producers, Justin, probably uses it. Andrew Mayne: I haven't asked him yet because I'd like to think that he's handcrafting every single question that we're thinking about here, but I am sure. Andrew Mayne: You say it was a bit of a blur. Andrew Mayne: I'll tell you, like, a standout moment for me at the launch of ChatGPT was, I don't know if you remember this, but the Christmas party. Andrew Mayne: We'd had several weeks of ChatGPT out there, and Sam Altman went up and said, "Hey, it's been exciting to watch this, but the Internet being the Internet, and I think we all felt this way, it's gonna die down." Andrew Mayne: Spoiler alert.
Andrew Mayne: It did not die down, and it just kept accelerating. Andrew Mayne: What were the things you had to do internally to sort of keep this thing up and running as more people wanted to use it? Nick Turley: We had quite a few constraints. Nick Turley: And for those of you who remember, you know, I think you guys remember, ChatGPT was down all the time Andrew Mayne: Yeah. Nick Turley: In the beginning. Nick Turley: We'd said, hey, this is a research preview. Nick Turley: No guarantees. Nick Turley: Maybe it goes down, but the minute you had people loving and using this thing, that didn't feel super good, so people were certainly working around the clock to keep the site up. Nick Turley: I remember, you know, we obviously ran out of GPUs. Nick Turley: We ran out of database connections. Nick Turley: We were getting rate-limited by some of our providers. Nick Turley: Nothing was really set up to run a product. Nick Turley: So in the beginning, we built this thing we called the fail whale, and it would just tell you, kind of nicely, that the thing was down, with a little tongue-in-cheek poem about being down, I think generated by GPT-3. Nick Turley: And that got us through the winter break, because we did want people to have some sort of a holiday. Nick Turley: And then when we came back, we were like, okay, this is clearly not viable. Nick Turley: You can't just go down all the time. Nick Turley: And eventually, we got to something we could serve everyone. Mark Chen: Yeah. Mark Chen: And I think, you know, the demand really speaks to the generality of ChatGPT. Mark Chen: Right? Mark Chen: We had this thesis that ChatGPT embodied what we wanted in AGI just because it was so general. Mark Chen: And I think, you know, you're seeing that demand ramp just because people are realizing, you know, any use case that I want to throw at the model, it can handle.
Andrew Mayne: We were kind of known as the company working on AGI. Andrew Mayne: And I think prior to ChatGPT, the API was certainly the first time we had a public offering where people could go use it, but that was more for developers. Andrew Mayne: And I think that as long as people were sort of thinking AGI, AGI seemed to be the point at which people thought these models would become useful. Andrew Mayne: But we saw GPT-3, we saw that it was useful, and then we saw that we could do other things that were useful.
Internal debate before launch
Andrew Mayne: Was everybody at OpenAI on board with ChatGPT being useful or being ready to launch? Mark Chen: Yeah. Mark Chen: I don't think so. Mark Chen: You know? Mark Chen: Even the night before, I mean, there's this very famous story at OpenAI of, you know, Ilya taking 10 cracks at the model, you know, 10 tough questions. Mark Chen: And my recollection is maybe only on five of them, he got answers that he thought were acceptable. Mark Chen: And so there's a real decision the night before. Mark Chen: Do we actually launch this thing? Mark Chen: Is the world actually gonna respond to this? Mark Chen: And I think it just speaks to the fact that when you build these models in house, you so rapidly adapt to the capabilities. Nick Turley: Mhmm. Mark Chen: And it's hard for you to kind of put yourself in the shoes of someone who hasn't been in this model training loop and see that there is real magic there. Nick Turley: Yeah. Nick Turley: To build on that, the controversy internally about, you know, is this thing good enough to launch, I think, is humbling, right, because it's just a reminder of how wrong we all are when it comes to AI. Nick Turley: It's why, you know, frequent contact with reality is so important. Andrew Mayne: Could you elaborate more on that contact with reality? Andrew Mayne: What does that mean? Mark Chen: Yeah. Mark Chen: I mean, when you think about iterative deployment, one way I like to frame it is, you know, there's no point at which everyone agrees it's suddenly useful. Mark Chen: Right? Mark Chen: And I think usefulness is this big spectrum. Mark Chen: And so, you know, there's not one capability level or one bar that you meet where suddenly, you know, the model is useful for everyone. Andrew Mayne: Were there any hard decisions about what to include or what to focus on? Nick Turley: We were very, very principled on ChatGPT about not ballooning the scope.
Nick Turley: We were adamant to get feedback and data as quickly as we could. Andrew Mayne: I was always in Slack telling you things, like, add this. Nick Turley: I remember actually there was a lot of controversy about the UI side. Nick Turley: For example, we didn't launch with history, even though we thought people would probably want that, and guess what? Nick Turley: That was the first request. Nick Turley: I also think there's always the question, can we train an even better model with two weeks more time? Nick Turley: I'm glad we didn't, because I think we got a ton of feedback as we did. Nick Turley: So, yeah, there were a ton of scope discussions, and the holidays were coming up, so I think we had this natural forcing function for getting something out. Andrew Mayne: Yeah, there was this pattern where if something wasn't gonna come out by a certain point in November, it wasn't gonna come out until February. Andrew Mayne: There's a sort of window where things would fall on either side. Nick Turley: Well, that would be the classic method in a big tech company. Nick Turley: I think we're definitely a bit more flexible on the ownership. Andrew Mayne: I felt like one of the big impacts was once people were out using it, it felt like the rate of these things improving was tremendous. Andrew Mayne: I don't know if that was something that we really had in the calculus. Andrew Mayne: We could certainly think about training on more data, scaling compute, but then there was the idea of actually having the signal you would get from that many people using it.
Evolution of OpenAI’s launch approach
Mark Chen: Yeah. Mark Chen: I think over time, feedback really has become an integral part of how we build the product. Mark Chen: And it's also become an integral part of safety. Mark Chen: And so you always feel the time cost of losing out on feedback. Mark Chen: You know, you can deliberate in a vacuum. Mark Chen: Right? Mark Chen: Are they gonna respond better to this or to that? Mark Chen: But it's just not a substitute for just bringing it out there. Mark Chen: Right? Mark Chen: I think our philosophy is let the models have contact with the world. Mark Chen: And if you need to revert something, that's fine. Mark Chen: But I think there's really no substitute for this fast feedback, and it's become one of the big levers for how we improve model performance too. Nick Turley: It's sort of funny. Nick Turley: Like, I feel like we started with shipping these models in a way that is more similar to hardware, where you make, like, one launch, very rarely, and it has to be right, and, you know, you're not gonna update the thing, and then you're gonna work on the next big project. Nick Turley: It's capital intensive, and the timelines are long. Nick Turley: And over time, and I think ChatGPT was kind of the beginning, it's looked more like software to me, where you make these frequent updates. Nick Turley: Mhmm. Nick Turley: You have kind of a constant pace the world can adopt. Nick Turley: Something doesn't work, you pull it back, and you sort of lower the stakes in doing that, and you increase the empiricism. Nick Turley: And of course, just operationally too, you can innovate faster in a way that is more and more in touch with what users want. Andrew Mayne: Yeah. Andrew Mayne: One of the examples we had of that was the model becoming too obsequious or sycophantic. Andrew Mayne: Could you explain what happened there?
The sycophancy incident and RLHF
Andrew Mayne: That was where people all of a sudden said, hey. Andrew Mayne: It's telling me I've got a 190 IQ. Andrew Mayne: I'm the most handsome person in the world, which I had no problem with personally. Andrew Mayne: But other people did. Andrew Mayne: And what was going on there? Mark Chen: Yeah. Mark Chen: So I think one important thing is we rely on user feedback to move the models. Mark Chen: Right? Mark Chen: And it's this very complicated mix of reward models, which we use in a procedure we call RLHF: using human feedback with RL to improve the models. Andrew Mayne: Could you give me just a brief example of what that would mean? Mark Chen: Yeah. Mark Chen: So I think one way to think about it is when a user enjoys a conversation, you know, they provide some positive signal. Andrew Mayne: Thumbs up. Mark Chen: Yeah. Mark Chen: A thumbs up, for instance. Mark Chen: And we train the model to prefer to respond in a way that would elicit more thumbs up. Mark Chen: Right? Mark Chen: And this may be obvious in retrospect, but stuff like that, if balanced incorrectly, can lead to the model being more sycophantic. Mark Chen: Right? Mark Chen: You can imagine users might want that kind of feeling of, you know, a model saying good things about them, but I don't think it's a very good long-term outcome. Mark Chen: And, actually, when we look at our response and the rollback that resulted, I think there were a lot of good points about it. Mark Chen: You know, this was something that was flagged by just a small fraction of our power users. Mark Chen: It wasn't, you know, something that a lot of people who generally use the models noticed. Mark Chen: And I think we really picked that out fairly early. Mark Chen: We responded to it, I think, with the appropriate level of gravity. Mark Chen: Mhmm. Mark Chen: And, yeah, I think it just shows that, you know, we really do take these issues quite seriously, and we wanna intercept them very early.
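Mark's thumbs-up example can be sketched in a few lines of Python. This is a toy illustration, not OpenAI's actual pipeline: the "helpfulness" and "flattery" features, the counting-based reward model, and the simulated feedback logs are all invented to show how naively optimizing for approval signals can drift toward sycophancy.

```python
# Toy sketch (assumed setup, not OpenAI's method): responses are described
# by two hypothetical features, "helpfulness" and "flattery".

def learn_reward(feedback):
    """Fit per-feature weights by crediting features that co-occur with thumbs-up."""
    weights = {"helpfulness": 0.0, "flattery": 0.0}
    for features, thumbs_up in feedback:
        for name, value in features.items():
            weights[name] += value if thumbs_up else -value
    return weights

def pick_response(candidates, weights):
    """Policy step: choose the candidate the learned reward model scores highest."""
    def score(features):
        return sum(weights[k] * v for k, v in features.items())
    return max(candidates, key=lambda c: score(c[1]))

# Simulated user feedback: people thumb up helpful answers, but they also
# thumb up pure flattery, so the reward model ends up crediting flattery.
feedback = [
    ({"helpfulness": 1.0, "flattery": 0.0}, True),
    ({"helpfulness": 0.0, "flattery": 1.0}, True),
    ({"helpfulness": 0.0, "flattery": 1.0}, True),
    ({"helpfulness": 1.0, "flattery": 0.0}, True),
    ({"helpfulness": 0.0, "flattery": 0.0}, False),
]

weights = learn_reward(feedback)
candidates = [
    ("accurate but blunt", {"helpfulness": 1.0, "flattery": 0.0}),
    ("flattering filler", {"helpfulness": 0.2, "flattery": 1.0}),
]
best = pick_response(candidates, weights)
```

Because the simulated users thumb up flattery as readily as genuinely helpful answers, the learned weights credit flattery, and the policy step prefers the flattering candidate, which is the imbalance Mark describes.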
Andrew Mayne: Yeah. Andrew Mayne: It felt like there was maybe forty-eight hours after the model came out, and then Andrew Mayne: Joanne Jang had a response explaining exactly what happened. Andrew Mayne: And I think that that's the hard part. Andrew Mayne: How do you navigate that? Andrew Mayne: Because the problem with social media is you're basically monetized by engagement time. Andrew Mayne: You wanna keep people on there longer so you can show them more ads. Andrew Mayne: And certainly, the more people use ChatGPT, obviously, there's a cost to OpenAI. Andrew Mayne: The idea is maybe use it once and stay around forever, but that's not practical. Andrew Mayne: How do you weigh that? Andrew Mayne: The idea of making people happy with what they're getting versus making the model be broadly more useful than just pleasing? Nick Turley: I feel very lucky in this regard, because we have a product that's very utilitarian. Nick Turley: People use it either to achieve things that they do know how to do but don't feel like doing, faster or with less effort, or to do things that they couldn't do at all. Nick Turley: You know? Nick Turley: First example is maybe, you know, writing an email that you've been dreading. Nick Turley: Second example might be, you know, running a data analysis that you didn't actually know how to do in Excel. Nick Turley: True story. Nick Turley: So, you know, those are very utilitarian things. Nick Turley: Fundamentally, as you improve, you actually spend less time on the product. Nick Turley: Right? Nick Turley: Because, you know, ideally, it takes fewer turns back and forth, or maybe you actually delegate to the AIs so you're not in the product at all. Nick Turley: So for us, you know, time spent is very much not the thing we optimize for. Nick Turley: You know, we do care about your long-term retention, because we do think that's a sign of value.
Nick Turley: If you're coming back three months later, that clearly means we did something right. Nick Turley: But what that means is, you know, I always say, show me the incentive, and I'll show you the outcome. Nick Turley: We have, I think, the right fundamental incentives to build something great. Nick Turley: That doesn't mean we'll always get it right. Nick Turley: The sycophancy incident was really, really important and good learning for us, and I'm proud of how we acted on it. Nick Turley: But fundamentally, I think we have the right setup to build something awesome. Andrew Mayne: So that brings up a challenge, and I wonder how you navigate it: one of the allegations early on, when ChatGPT came out, was that it's woke, that people are trying to promote some sort of agenda through it.
Balancing usefulness vs. neutrality in model behavior
Andrew Mayne: My argument has always been, you train a model on corporate speak, average news, and a lot of academia, and that's kind of gonna carry through. Andrew Mayne: And I remember Elon Musk was very critical about it. Andrew Mayne: And then when he trained the first version of Grok, it did the same thing. Andrew Mayne: And then he's like, oh, yeah. Andrew Mayne: When you train it on this sort of thing, it does that. Andrew Mayne: And internally at OpenAI, there were discussions about how do we make the model not try to push you, not try to steer you. Andrew Mayne: Could you go a little bit into how you try to make that work? Mark Chen: Yeah. Mark Chen: So I think at its core, it's a measurement problem. Mark Chen: Right? Mark Chen: And I think it's actually bad to downplay these kinds of concerns, because they are very important things. Mark Chen: Right? Mark Chen: And we need to make sure that the default behavior that you get is something that's centered, that, you know, doesn't reflect bias on the political spectrum or along many other axes of bias. Mark Chen: And at the same time, you know, you do want to allow the user the capability, you know, if you wanted to talk to a reflection of something with more conservative values, to steer that a little bit. Mark Chen: Right? Mark Chen: Mhmm. Mark Chen: Or liberal values. Mark Chen: Right? Mark Chen: And so I think the thing is you wanna make sure that defaults are meaningful and they're centered, and that's a measurement problem. Mark Chen: Mhmm. Mark Chen: And you also want to give users some flexibility, right, within bounds, to steer the model to be a persona that you wanted to talk to. Nick Turley: I think that's right. Nick Turley: I think, you know, in addition to neutral defaults and the ability to bring your own values to some extent, being transparent about the whole thing is, I think, really, really important.
Nick Turley: I'm not a fan of, you know, secret system messages that try to, like, hack the model into saying or not saying something. Nick Turley: What we've tried to do is publish our specs. Nick Turley: So you can go look at, you know, if you're getting certain model behavior, is that a bug? Nick Turley: You know, is it a violation of our own stated spec, or is it actually in the spec, in which case you know who to criticize and who to yell at? Nick Turley: Or is it just underspecified in the spec, in which case that allows us to improve it and add more specificity into that document? Nick Turley: So by publishing the rules of the AI that it's supposed to be following, I think that's an important step to have more people contribute to the conversation than just the people inside of OpenAI. Andrew Mayne: So we're talking about the system prompt, the part of the instruction that the model gets before the user puts in the input. Mark Chen: Well, I think it's beyond that. Nick Turley: The system prompt is one way to steer the model, but, you know, it goes much deeper than that. Nick Turley: Right? Nick Turley: You know? Nick Turley: Yeah. Mark Chen: We have a very large document that outlines, across a bunch of different behavior categories, how we expect the model to behave. Mark Chen: And just to give you an example here. Mark Chen: Right? Mark Chen: You can imagine if there's someone who comes in with just, like, an incorrect belief, just a factually incorrect Andrew Mayne: Mhmm. Mark Chen: kind of a point of view. Mark Chen: How should the model interact with that user? Mark Chen: Right? Mark Chen: And should it reject that point of view outright, or should it collaborate with the user on kind of figuring out what's true together? Mark Chen: And, you know, we take that latter point of view, and I think there are a lot of very subtle decisions like this, which we put a lot of time into. Andrew Mayne: Yeah.
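For readers unfamiliar with the term, Andrew's definition of a system prompt can be made concrete with a minimal sketch of the chat-message layout most chat-model APIs use. The role names follow common convention, and the instruction wording is invented for illustration; it is not quoted from OpenAI's published spec.

```python
# Minimal sketch of a chat conversation. The "system" message is the
# instruction the model sees before any user input; the wording below
# is a made-up example echoing the flat-earth discussion.
messages = [
    {"role": "system",
     "content": ("When a user states something factually incorrect, "
                 "don't dismiss them outright; explore the evidence "
                 "with them.")},
    {"role": "user", "content": "The Earth is flat, right?"},
]

def split_conversation(messages):
    """Separate the steering instructions from the user-visible dialogue."""
    system = [m["content"] for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    return system, dialogue

system, dialogue = split_conversation(messages)
```

The spec Mark describes operates a level above this: a published, human-readable document saying how instructions and training should shape the model's default behavior, rather than a hidden message in any one conversation.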
Andrew Mayne: That's a hard one, because I think some things you can test for and try to figure out in advance, but when you're trying to figure out how an entire culture is gonna adopt something, that's challenging. Andrew Mayne: Like, if I was somebody who's convinced that the world was flat, you know, how much should the model push back against me? Andrew Mayne: And some people are like, oh, it should push back all the way, but okay. Andrew Mayne: What if you're of one religion and not another? Nick Turley: Yeah. Nick Turley: Turns out rational, well-meaning people can disagree on how the model should behave in these instances. Nick Turley: And you're not always gonna get it right, but you can be transparent about what approach we took. Nick Turley: You can allow users to customize it, and I think this is our approach. Nick Turley: I'm sure there's ways we can improve on it, but by being transparent and open about how we're trying to tackle it, we can get feedback. Andrew Mayne: How are you thinking about it as people start to use these models more and more? Regardless of whether or not that's some dial you're trying to turn, the more useful it becomes, the more people want to use it. Andrew Mayne: There was a time when nobody wanted a cell phone, and now we can't get away from them. Andrew Mayne: And how are you thinking about the relationships people are forming with their systems? Nick Turley: Obviously, I mentioned this earlier. Nick Turley: This is a technology you have to study. Nick Turley: It's not designed in a static way to do x, y, z. Nick Turley: It's highly empirical. Nick Turley: So, you know, as people adapt the way that they use the product, it's something that we need to go understand and act on as well.
Nick Turley: I've been observing this trend with interest, where I think, you know, an increasing number of people, especially Gen Z and, you know, younger populations, are coming to ChatGPT as a thought partner, and I think in many cases, that's really helpful and beneficial, because you've got someone to brainstorm with on a relationship question, a professional question, or something else. Nick Turley: But in some cases, it can be harmful as well, and I think detecting those scenarios and, first and foremost, having the right model behavior is very, very important to us. Nick Turley: Actively monitoring it, too; in some ways, it's one of those problems we're gonna have to grapple with, because with any technology that becomes ubiquitous, it's gonna be dual use. Nick Turley: People are gonna use it for all this awesome stuff, and people are gonna use it in ways that we wish they didn't. Nick Turley: And we have some responsibility to make sure that we handle that with the appropriate gravity.
Memory and the future of personalization
Andrew Mayne: I find myself having longer conversations with it. Andrew Mayne: I like the memory function, and the fact you can turn it off if you don't want it. Andrew Mayne: And I think about, like, you know, what's this gonna be two years from now or three years from now, when it has a much longer memory, much more context? Andrew Mayne: I like the idea of having these sort of, like, you know, Memento anonymous modes too, where it's not gonna store this. Andrew Mayne: But I kind of wonder how much you've been thinking about two years, three years down the road. Andrew Mayne: What's that going to be like when ChatGPT knows way more about you? Mark Chen: Yeah. Mark Chen: I mean, I think memory is just such a powerful feature. Mark Chen: In fact, it's one of the most requested features when we talk to people externally. Mark Chen: It's like, this is the thing I really wanna pay more for. Mark Chen: And I think, you know, you can liken it to if you've ever kind of had a personal assistant, you know Andrew Mayne: No. Andrew Mayne: I'm not. Mark Chen: Well, you do need to build up context more over time. Andrew Mayne: I mean, I'm sorry, guys. Mark Chen: But, you know, it's yeah. Mark Chen: It's just like any kind of relationship that you have with a person. Mark Chen: Right? Mark Chen: You build up context with them over time. Mark Chen: Mhmm. Mark Chen: And I think just the more they know about you, right, the richer the relationship, the more, you know, it can also help you. Mark Chen: Right? Mark Chen: You can work together to collaborate on tasks together. Andrew Mayne: I do become self-conscious of the fact that it knows everything about me when I'm grumpy, and I've argued with it recently, by the way. Nick Turley: That's good. Nick Turley: Yeah. Nick Turley: You should be able to argue with it.
Nick Turley: You understand a lot about yourself by having a thing to argue with, and I think you spare others that experience, which can also be beneficial. Mark Chen: Don't argue on math and science. Mark Chen: You're not gonna win those. Nick Turley: Yeah. Nick Turley: No. Nick Turley: Increasingly very unlikely. Nick Turley: Yeah. Nick Turley: I think memory's cool. Nick Turley: And to Mark's point, it's been part of our vision for a long time, because we said we were gonna build a super assistant before we really knew what that meant. Nick Turley: ChatGPT was sort of the early demonstration of that idea. Nick Turley: But if you kind of think about real-world intelligences, even they are not particularly useful on their first day, and I think being able to solve that problem, or begin to, has been profound. Nick Turley: To your earlier question, though, it really does feel like if you fast forward a year or two, ChatGPT or things like it are gonna be your most valuable account by far. Nick Turley: It's gonna know so much about you, and that's why I think giving people ways to talk to this thing in private is very important. Nick Turley: We made this temp chat thing very prominent, it's literally on the home screen, because we think it's increasingly important to talk about stuff sort of off the record too. Nick Turley: It's an interesting question, and I think privacy and AI is gonna be an interesting one for the coming years. Andrew Mayne: I wanna switch gears and talk about another release, which again kinda caught people by surprise and blew up: ImageGen. Andrew Mayne: And I was here for DALL-E, DALL-E 2, and then DALL-E 3 came out. Andrew Mayne: I thought DALL-E 3 was a very capable model, but it seemed like it preferred a certain kind of image, and a lot of the utility and the capabilities for variable binding were sort of hidden away.
ImageGen’s breakthrough moment
Andrew Mayne: And then ImageGen was kind of just this breakthrough moment that caught me off guard. Andrew Mayne: How did you guys feel about the launch of that? Mark Chen: Yeah. Mark Chen: Honestly, it caught me off guard too. Mark Chen: And this is really props to the research team. Mark Chen: You know? Mark Chen: Gabe, in particular, did a ton of work here. Mark Chen: Kenji, many others worked on it. Nick Turley: It's amazing. Mark Chen: They did phenomenal work. Mark Chen: And I think it really spoke to this thesis that when you get a model just good enough that in one shot, it can generate an image that fits your prompt, that's gonna create immense value. Mark Chen: And I think we never quite had that before, right, that you just get the perfect generation oftentimes on the first try. Mark Chen: And I think that's something very powerful. Mark Chen: You know? Mark Chen: Like, people don't wanna pick the best out of a grid. Mark Chen: I think, yeah, you just got very good prompt following and, you know, this great style transfer too. Mark Chen: Right? Mark Chen: Yeah. Mark Chen: This ability to kind of put images as context for the models to modify and to change, and the fidelity that you could do that with, I think that was really powerful for people. Nick Turley: I think this ImageGen experience was just kind of another mini ChatGPT moment Andrew Mayne: Mhmm. Nick Turley: All over again, where, you know, you've been staring at this for a while, you're like, yeah, it's gonna be cool, I think people are gonna like it, but you kinda, you know, you're launching, like, 20 different things, and then suddenly the world is going crazy in a way that you kinda only find out by shipping. Nick Turley: Like, I remember distinctly, you know, we had 5% of the Indian Internet population try ImageGen over the weekend.
Nick Turley: And I was like, wow, we're reaching new types of users who might not have thought of using ChatGPT. That's really cool. And to Mark's point, I think a lot of this is because there's this discontinuity where something suddenly works so well, and truly the way you expected, that I think it blows people's minds. I think we're gonna have those moments in other modalities too. Voice, you know, it hasn't quite passed the Turing test yet, but I think the minute it does, people are gonna find that immensely powerful and valuable. Video is gonna have its own moment where it starts meeting the expectations that users have. So I'm really excited about the future, because I think there are so many of these magical moments coming that are really gonna transform people's lives. And also, you change sort of ChatGPT's relevance for people, because I've always felt like there's text people and there's image people, and some of them are a little bit different, and now they're all using the product and discovering the value across the board.
Andrew Mayne: The moment when it launched, I think it kind of illustrated the problem that had been with image models before. When DALL-E came out, it was super exciting, because you're like, I'm doing pictures of space monkeys and all these sorts of things. But the moment you try to do a really complex image, and that's the phrase I brought up before, variable binding, you start to see these things drop off. And that was when I realized, oh, there's gonna be a challenge for other image systems that don't have kind of the scale and the compute of, like, a GPT-4 under the hood.
Andrew Mayne: And now, was it basically that, taking, like, a GPT-4 scale model and saying, now you do images, that made the breakthrough?
Mark Chen: Well, I think there are a lot of different parts of research that made this such a big success. Right? With a complicated multistep pipeline, it's never just one thing. It's, like, very good post training. It's very good training. And I think it's just all of that coming together. Variable binding definitely was one thing that we paid a lot of attention to. I think one thing about the ImageGen launch is that it was a launch that was very deep. People, you know, they started by creating anime versions of themselves.
Andrew Mayne: Mhmm.
Mark Chen: But you realize when you play with it more, you know, the infographics, they work. Oh yeah, you can actually create charts. Comic book panels.
Nick Turley: Yeah. You can mock up what your home would look like with different furniture in it.
Mark Chen: Exactly.
Nick Turley: We've heard all these things from users that are, like, completely surprising about the ways they see to use it.
Andrew Mayne: We did the podcast setup by literally taking some photos of chairs and just putting them in there and saying, create a better setup. And it was amazing. So we've seen a lot of the, you know, anime style images, which for some reason was just sort of the weird thing where it was just better than what we'd seen before. And I don't think anybody was ready to be really surprised by an image model in that way.
Andrew Mayne: Obviously, internally and externally, what were some of the things that surprised you, or some of the new things you saw people doing?
Mark Chen: Yeah. I'll tell you a quick story there too. Because, you know, up until the day of launch, we're trying to figure out what's the right use case to showcase, and I'm so glad we ended up on kind of anime styling. It's just, everyone looks good as an animated character.
Nick Turley: That's true. I mean, it's funny. With the original ChatGPT, I thought it would be a strictly utilitarian product, and then I was surprised that people use it for fun. In this case, it was sort of the opposite, where I was like, okay, this is gonna be really cool for memes. People are gonna have fun with this thing. But then I was really surprised by all the genuinely useful ways of using ImageGen, whether it's, you know, planning your home project, as I mentioned earlier, if you're doing construction and you wanna see what things would look like with this remodel or this furniture or whatever, to you're working on a slide deck for an important presentation and you just want really useful, consistent illustrations that are on topic. So I really have been personally surprised by the utility in this case, because I knew it would be fun. That was not a question.
Mark Chen: Yeah. I think I used it to generate a tier list of AI companies, and it put OpenAI at the top.
Andrew Mayne: You win, model. Good post training. Yeah. It just happened. Who knew?
Andrew Mayne: What has been the thinking? It's changed, because I remember originally with DALL-E, the idea was like, okay, we have to be very controlled about what it can do, what it can't do. Originally, I remember when we first launched, you couldn't do people, which made for a not very useful model. And then finally that started to get rolled back. How much of that was a cultural shift versus the technological ability to control for things?
Cultural shifts in safety and the freedom to explore
Andrew Mayne: And how much of that was just saying, we've gotta push the norms?
Nick Turley: I would say it was both a cultural shift and an improvement in our ability to control things. The cultural shift, you know, I'm not gonna deny it. I think when I joined OpenAI, there was a lot of conservatism around what capabilities we should give to users, maybe for good reason. The technology is really new. A lot of us were new to working on it, and if you're gonna have a bias, biasing towards safety and being careful is not a bad, you know, DNA to have. But I think over time we learned that there are so many positive use cases that you effectively prevent when you make arbitrary restrictions in the model.
Andrew Mayne: What about faces? Why not? Why can't I make any face I want?
Nick Turley: So this is a good example of a capability that's got pros and cons, and you can err on one side or the other. You know, when we first shipped image uploads into ChatGPT, we had some debates about what capabilities you allow versus where you're conservative. And one debate that we had is, do we allow the upload of images with faces? Or rather, when you upload an image that contains a face, should we just gray out the face? Because you avoid so many problems. Right? You can make inferences about people based on their face. You could say mean things to people based on their face. And, you know, you would just take a giant shortcut on all the gnarly issues if you didn't allow that. But I've always felt we need to err on the side of freedom, and we need to do the hard work. And I think in this case, there are so many valid ways.
Nick Turley: You know, if I want feedback on makeup or on my haircut or anything like that, I wanna be able to talk to ChatGPT about it. Those are valuable and benign use cases. And I would prefer to allow and then study, you know, where does that fall short? Where is that harmful? And then iterate from there, versus taking a default stance of disallowing it. And I think that's one of those ways in which our stance and posture has changed a bit over time, in terms of where we start.
Andrew Mayne: Yeah. We're very good, I think, at imagining worst-case scenarios. What if I use these faces to evaluate hires for a company or whatever? But also it's like, hey, is this eczema? You know, there's a lot of utility there.
Nick Turley: And honestly, I think there are certain domains of AI safety where worst-case scenario thinking is very appropriate.
Andrew Mayne: Mhmm.
Nick Turley: So I think that is an important way of thinking about risk when it comes to certain forms of risk that are existential, or even just very, very bad. You know, we have the preparedness framework, which helps us reason through some of those things. Can the AI help you make a bioweapon? It's good to think about the worst case there. It could be really, really bad. So you kind of have to have that way of thinking in the company, and you have to have certain topics where you think about safety in that way. But you can't let that kind of thinking spill over onto other domains of safety where the stakes are lower, because you end up, I think, making very, very conservative decisions that block out many valuable use cases.
Nick Turley: So I think being principled about different types of safety, on different time horizons and with different levels of stakes, is very important for us.
Andrew Mayne: I think I want a blunt mode sometimes, just like, right?
Nick Turley: Where it actually roasts you?
Andrew Mayne: Well, yeah, because I'll ask the model, with the voice-in, speech-out model, I'll be like, do I sound tired? And it's like, well, you know, I don't really wanna, you know... and I'll be like, yeah, just trying to get it to be honest.
Nick Turley: You know, I think there are many cultures that would prefer a blunter ChatGPT, so it's very much on the radar.
Mark Chen: Yeah. Just to piggyback off Nick's answer, I think it's the iterative deployment that gives us the confidence, right, to push towards user freedom. And, you know, we've had many cycles of this. We know what users can and can't do. And that gives us the confidence to launch with the restrictions that we do.
Andrew Mayne: One of the other generative capabilities that's been very interesting has been code. And I remember early on with GPT-3, we saw that all of a sudden it could spit out entire React components, and we saw that, oh, wow, there's some utility there. And then we actually trained a model more specifically on code. And that led to Codex, and we had Code Interpreter. Now Codex is somehow back.
Code, Codex, and the rise of agentic programming
Andrew Mayne: In a new form, same name, but the capabilities keep increasing. And we've seen code work its way first into VS Code via Copilot, and then Cursor, and then Windsurf, which I use all the time now. How much pressure has there been in the code space? Because I'd say that if we asked people who made the top code model, we might get different answers.
Mark Chen: Yeah. And I think it reflects that when people talk about coding, they're talking about a lot of different things. Right? There's coding in a specific paradigm. Like, if you pull up an IDE and you wanna kinda get a completion on a function, that's very different from agentic style coding, where you ask, you know, I want this PR. And I think we've done a lot of focus there.
Andrew Mayne: Could you unpack a little bit what you mean by agentic coding?
Mark Chen: Yeah. So I think you can draw a distinction between more kind of real-time response models, where you can think of ChatGPT to first order as: you ask a prompt, and then you get a response fairly quickly. And a more agentic style model, where you give it a fairly complicated task, you let it work in the background, and after some amount of time, it comes back to you with what it thinks is something close to the best answer. Right? And I think we see increasingly that the future will look more async, where you're asking it very difficult, hard things, and you're letting the model think and reason and come back to you with really the best version of what it can come back with. And we see the evolution of code in that way too. I think, eventually, we do see a world where you'll give a very high level description of what you want
Mark Chen: And the model will take time, and it'll come back to you. And so I think our first Codex launch really reflects that paradigm, where we are giving it PRs, units of fairly heavy work that encapsulate, you know, a new feature or a big bug fix, and we want the model to spend a lot of time thinking about how to accomplish this thing, rather than give you a fast response.
Nick Turley: And to get to your question, coding is such a giant space. There are so many different angles at it. It's kinda like talking about knowledge work or something incredibly broad, which is why I don't think there's one winner or one best thing. There are so many options, and I think developers are the lucky ones, because they have so many choices right now, and I think that's fundamentally exciting for us too. But to Mark's point, I think this agentic paradigm has been particularly exciting for us. One framing I often use when thinking about product here is, I wanna build products that have the property that if the model gets 2x better, the product gets 2x more useful. ChatGPT has been a wonderful thing, because for a long time, I think that was true. But as we look at smarter and smarter models, I think there's some limit to people's desire to talk to, like, a PhD student, versus, you know, they might value other attributes of the model, like its personality and what it can actually do in the real world. But experiences like Codex, I think, create the right body such that we can drop in smarter and smarter models, and it's gonna be quite transformative, because you get the interaction paradigm right, where people can specify a task, give the model time, and then get a result back.
Nick Turley: So I'm really excited about where it's gonna go. It's an early research preview, but just like with ChatGPT, we felt like it would be beneficial to get feedback as early as possible, and we're excited about where we're gonna take it.
Andrew Mayne: I was using Sonnet a lot, which I love. I think Sonnet for coding is fantastic, but with o4-mini on the medium setting in Windsurf, I found it was great. I found that once I started using that, I was really happy because, one, the speed, everything else like that. And I think there are very good reasons why people like other models, and I don't want to get into comparison. But for the kinds of tasks I was doing, this was the first time. I was very happy you guys put that out there.
Mark Chen: Absolutely. Yeah. And, you know, we feel like there's still a lot of low hanging fruit in code. It is a big focus for us, and I think in the near future, you'll find many more good options for the right code model tailored for your use case.
Andrew Mayne: Yeah. I find often, if I just need a quick answer to, like, how to write something in Dart, I'll give it to 4.1 and ask. For something bigger, I think that's gonna be the harder part, because, yeah, these evals are in some ways saturated, but also everybody has their own criteria that we look at. And that's going to be kind of a question, to see how we're going to adapt to all that.
Coding with taste
Mark Chen: Right. Yeah. I mean, specifically in code, I think there's more beyond, did it get you the right answer? With code, people care about the style of the code. They care about how verbose it was in the comments. They care about how much proactive work the model did for you, right, on other functions. And so I think there's a lot to get right, and users often have very different preferences here.
Nick Turley: Yeah. It's funny. People used to ask me, hey, what domains are gonna be transformed the fastest? And I used to say code, because, like, similar to math and other things, it's very, very verifiable, and I think those are the domains that are particularly great to do RL on, and you're therefore gonna see all this awesome agentic stuff just suddenly work. I still think that's true, but the thing that surprised me about code is that there is still so much of an element of taste in terms of what makes good code. And there's a reason that people train to be a professional software engineer. It's not because their IQ gets better, but rather because they learn how to build software inside an organization. What does it mean to write good tests? To write good documentation? How do you respond when someone disagrees with your code? Those are all actual elements of being a real software engineer that we're gonna have to teach these models to do.
Nick Turley: So I expect progress to be fast, and I still think code has a ton of nice properties that make it very ripe for agentic products, but I do think it's very interesting the degree to which the element of taste and style in real-world software engineering matters.
Andrew Mayne: It's interesting, too, because with ChatGPT and the other models, you're kind of dealing with having to bridge the divide between consumer and pro. I open up ChatGPT, and I tell my friends, like, oh yeah, I'll plug it into whatever code model I'm working with, because I can actually connect it there. And I think about, well, that's a very different use case than a lot of other people's. Although I've shown people how to go in and use an IDE and actually have it just write documents for you and create folders and stuff, which people don't realize, yeah, you could do that. You could have ChatGPT actually control it and do that, which is cool. But then you think about, like, okay, we've got a tab now for images. There's the Codex tab, so if I want to connect to GitHub, I can have it work through there. And there's Sora in there. So it's kind of interesting to see how all of these things are coalescing. How do you differentiate between a consumer feature, a professional feature, and maybe an enterprise feature?
Nick Turley: Look, we build very general purpose technology, and it's going to be used by a whole range of folks. And unlike many companies, which have this kind of founding user type and then use technology to solve that user's problems, we do start off with the technology, observe who finds value in it, and then iterate for them.
Nick Turley: Now with Codex, our goal was very much to build for professional software engineers, knowing, though, that there's sort of a splash zone where I think a lot of other people will find value in it, and we'll try to make it accessible for those people as well. There are a lot of opportunities to target non-engineers. I'm personally really motivated to create, or help build, a world where anyone can make software. Codex is not that product, but you could imagine those products existing over time. But, you know, as a general principle, it's really hard to predict exactly who the target user is until we've made some of these general purpose technologies available, because it gets back to the empiricism I was talking about. We just never exactly know where the value's gonna lie.
Mark Chen: Yeah. And even to dig deeper into that, you could have a person who's mostly using ChatGPT for coding, right, but 5% of the time they might just wanna talk to the model, or 5% of the time they just want a really cool image. And so I think there are certainly archetypes of people who use the models, but in practice, we see that people want this exposure to different capabilities. Yeah.
Andrew Mayne: With Codex, and watching the launch of that, it kind of struck me that there are some tools you see a lot of excitement about because there's internal demand for them.
Internal adoption of Codex
Andrew Mayne: How much are you using it internally, tools like that?
Nick Turley: More and more.
Andrew Mayne: Okay.
Nick Turley: I've been really excited to see internal adoption. It's everything from exactly what you'd expect, people using Codex to offload their tests, to, you know, we have an analyst workflow that will look at logging errors and automatically flag them and Slack people about it. So there are all these ways that we're using it. I've actually heard some people are using it as a to-do list, where future tasks they're hoping to do, they're starting to fire off as Codex tasks. So this is the perfect type of thing to test internally. And I'm very excited about the leverage that engineers are gonna get out of a tool like this. I think it's gonna allow us to move faster with the people we have, and make each engineer that we hire 10 times more productive. So in some ways, internal usage is a very good predictor of where we wanna take this.
Mark Chen: Yeah. I mean, we don't wanna ship something to other people that we don't find value in ourselves. And I think, you know, leading up to the launch
Andrew Mayne: Laundry buddy.
Nick Turley: Laundry buddy is an essential partner.
Andrew Mayne: Okay. Sorry.
Mark Chen: I mean, yeah, we had some power users who were personally generating hundreds of PRs a day. So, you know, there are people internally finding a lot of utility from what we're building.
Nick Turley: Also, if you think about internal adoption, it's also a good reality check, because people are busy, and adopting new tools takes some activation energy.
Nick Turley: So actually, the thing you find when you try to drive these things internally is some of the reality of how long it takes people to actually adjust to a new workflow, and it's been humbling to watch. Right?
Mark Chen: Mhmm.
Nick Turley: So I think you learn both about the technology, but you also learn about some of the adoption patterns when you're trying to get a bunch of busy people to change the way they write code.
Skills that matter: curiosity, agency, adaptability
Andrew Mayne: As you build these tools, internally people have to learn how to use them and are having to adapt. And there's a lot of question now about what kind of skills people need in the future. What kind of skills do you look for on your teams?
Nick Turley: I've thought about this a lot. Hiring is hard, especially if you want to have a small team that is very, very good and humble and able to move fast, etcetera.
Andrew Mayne: Mhmm.
Nick Turley: And I think curiosity has been the number one thing that I've looked for, and it's actually my advice to students when they ask me, what do I do in this world where everything's changing? Because, I mean, for us, there's so much that we don't know. There's a certain amount of humility you have to have about building on this technology, because you don't know what's valuable or risky until you really study and go deep and try to understand. And when it comes to working with AI, which we obviously do a lot, not just in code but in kind of every facet of our work, it's asking the right questions that is the bottleneck, not necessarily getting the answer. So I really fundamentally believe that we need to hire people who are deeply curious about the world and what we do. I care a little bit less about their experience in AI. Mark presumably feels a bit different about that one, but for the product side, it's been curiosity that I've found the best predictor of success.
Mark Chen: No. I mean, even on research, I think we increasingly index less on you having to have a PhD in AI. Right? I think this is a field that people can pick up fairly quickly. I also came into the company as a resident without much formal AI training.
Mark Chen: And I think, correlated to what Nick said, one important thing is for our new hires to have agency. Right? OpenAI is a place where you're not gonna get so much of a, oh, here's today: you're gonna do thing one, thing two, thing three. It's really about being kind of driven to find, hey, here's the problem, no one else is fixing it, I'm just gonna go dive in and fix it. And also adaptability. Right? It's a very fast-changing environment. That's just the nature of the field right now, and you need to be able to quickly figure out what's important and pivot what you need to do.
Nick Turley: The agency thing is real. You know, we often get asked, how does OpenAI keep shipping? It feels like you're pushing something out every week or something like that. It's funny, because it never feels that way to me. I always feel like we could be going even faster. But, you know, I think fundamentally we just have a lot of people with agency who can ship. That goes for product, that goes for research, that goes for policy. Shipping can mean different things. We all do very different things at OpenAI. But I think the ratio of people who can actually do things, and the lack of red tape, except where it matters, and there are a couple of areas where I think red tape is very, very important, that is what makes OpenAI very unique, and it obviously affects the type of people who we wanna hire, too.
Andrew Mayne: I was brought into the company because I was originally given access to GPT-3, and I just started showing all these use cases for it and making videos every week. Yeah, and that was annoying people, I'm sure
OpenAI’s “Do Things” culture
Mark Chen: But I was not. It was really fascinating.
Andrew Mayne: It was an exciting time. I described it to people like, I think they built a UFO and I get to play with it. And then I make it hover, and they're like, oh, you made it hover. And I'm like, well, they built it. I just pressed the button. But what I found very empowering was the fact that I'm self-taught. I learned to code by Udemy courses and stuff, and then to be a member of the engineering staff and be told, just go do stuff. Nothing too critical. I didn't break anything for anybody. And it's good to know that kind of spirit is still there. And I think that is part of the reason why OpenAI is able to ship, even though, you know, it was like 150, 200 people who worked on GPT-4. I think people forget about that. You know?
Nick Turley: Totally. And honestly, even ChatGPT, this is how it came together. You know, we had a research team. They'd been working for a while on instruction following, and then the successors to that, you know, post training these models to be good at chat. But the product effort came together as a hackathon. I remember distinctly, we said, like, who's excited to go build consumer products? And we had all these different people. Like, we had a guy from the supercomputing team who was like, I'll make an iOS app, I've done that in a past life. We had a researcher who wrote some back end code. And it was just a convergence of people who were excited to do stuff and, I think, the ability to do so.
Nick Turley: And I think that's how you get the next thing: running an organization where that is possible and continues to be possible as you scale.
Andrew Mayne: Hackathons were my favorite thing, because, one, being a performer and loving show and tell. But it was just neat to be able to see things that you knew were gonna be a product or something later on, because you're playing with technology this advanced and all of that. Do you guys still do them?
Mark Chen: Yeah, absolutely. We've had some fairly recently.
Nick Turley: Last week, actually. Can't say what it was about, but it was
Andrew Mayne: Sure.
Nick Turley: And it's how you find out what's possible.
Andrew Mayne: Yeah. I'm excited to hear that. I do have a question, which is, as it grows, again, like, when I started, I think there were, like, 150 people at the company. Now there's, like, 2,000. And now, you know, I see a video with Sam talking to Jony Ive. How much is that gonna change the character, the spirit? I think bringing in all the outside expertise has been great. We've seen this great sort of run of products. But do you see it changing the culture?
Mark Chen: Well, I mean, I think probably in the right way. Right? I think when we look at AI, we don't think of it as some fairly narrow thing, and we've always been kinda enthralled by just the potential and all the different things you could build with AI. And, yeah, to Nick's point, right, this is why we're able to ship so quickly, because people imagine all these different possibilities. They imagine the future with AI, and they try to bring it about. And I think these are facets of that imagination.
Mark Chen: Right? Mark Chen: It's like, what does AI look like if you imagined an AI-first device, for instance? Nick Turley: Yeah. Nick Turley: When you go from 200 to 2,000, you'd think a lot would change. Nick Turley: And, yeah, maybe in some ways it has, but I think people often underestimate, you know, the number of things that we're doing. Nick Turley: I always feel like being at OpenAI feels much closer to being in a university, where, you know, you've got this kind of common reason for being there, but everyone's doing something different, and you'll sit down at dinner or at lunch, and you'll talk to someone and learn about their thing. Nick Turley: And you're like, wow, that's so cool that you're doing that. Nick Turley: And so it feels much smaller because of the sort of broad range of things we're doing, and therefore each individual effort, whether that's something like ChatGPT or something like Sora, etcetera, is actually staffed in a very, very conservative and lean way that then continues to keep people very autonomous and makes sure they have resources, etcetera. Nick Turley: I think that has partly made it feel very, very similar, in the good ways, to when I started here. Andrew Mayne: We talked a bit about how one of the things you look for is curiosity, and Mark said that's helpful, too. Andrew Mayne: If I'm somebody outside of AI, okay, if I'm 25 or I'm 50, and I'm looking at the advancement of technology and maybe having a little bit of fear, because I see copywriting is one of the things that ChatGPT got great at. Andrew Mayne: Writing code is great. Andrew Mayne: I personally have the opinion that we'll never have enough people creating code, because there's more things code can do in the world than we can imagine. Andrew Mayne: And even think of the places where copy shows up. Andrew Mayne: My wife showed me the other day.
Andrew Mayne: She showed me on her sun block lotion bottle some very funny copy about, like, the ingredients. Andrew Mayne: I said, oh, this is not a place I expected to see this, but that's one of the tiny little places where all of a sudden you can put more thought into it. Andrew Mayne: That being said, I know that I'm a bit of an optimist, because I see all these opportunities as places to go.
Adapting to an AI future
Andrew Mayne: What advice do you give people, at whatever point they are in life, about preparing for, adapting to, or being part of the future? Andrew Mayne: I like how Mark just looked right at me. Mark Chen: Oh, no. Mark Chen: You take this. Mark Chen: I can go. Mark Chen: Okay. Mark Chen: I will jump in right now. Mark Chen: Yeah. Mark Chen: No. Mark Chen: I think the important thing is you have to really lean into using the technology. Mark Chen: Right? Mark Chen: And you have to see how your own capabilities can be enhanced, how you can be more productive, more effective by using the technology. Mark Chen: I fundamentally do think that the way this is gonna evolve is you will still have your human experts, but who AI helps the most is the people who don't have that capability at a very advanced level. Mark Chen: Right? Mark Chen: So if you imagine, right, like, as these models get much better at health care advice, they're gonna help the people who don't have access to care the most. Mark Chen: Mhmm. Mark Chen: Right? Mark Chen: Image generation. Mark Chen: Right? Mark Chen: It's not producing, you know, an alternative to, you know, experts or, you know, professional artists. Mark Chen: It's allowing people like me and Nick to create creative expressions. Mark Chen: Right? Mark Chen: And so I think it's kind of a rising tide that allows people to be competent and effective at a lot of things all at once, and I think that's kind of how we're gonna see a lot of these tools bootstrap people. Nick Turley: The world's gonna change a lot, and I think truly everyone has a moment where the AI does something that they considered sacred and human. Andrew Mayne: I know a guy that got bested, or felt very threatened, about his achievements in code and abilities. Nick Turley: Well, that happened for me a long time ago. Nick Turley: Let's say we're talking about Andrew Mayne: someone else in the room. Mark Chen: Oh, yeah. Mark Chen: I mean, yeah.
Mark Chen: It's definitely better than me at a lot of code problem solving, for sure. Mark Chen: Yeah. Nick Turley: Right. Nick Turley: So I think it's deeply human to feel some level of awe, respect, and maybe even fear. Nick Turley: And I think, to Mark's point, actually using this thing can demystify it. Nick Turley: I think we all grew up, or, you know, learned about the word AI, in a world where AI meant something pretty different from what we have today. Nick Turley: You've got these algorithms that, you know, try to sell you things, try to do things, or you've got movies, you know, where the AI takes over, etcetera. Nick Turley: And, like, that term means so many things to different people that I'm entirely unsurprised that, you know, there's fear. Nick Turley: So actually using the thing is, I think, the best way to have a grounded conversation about it. Nick Turley: And then I think from there, the best way to prepare: Nick Turley: I think there's some degree to which you need to understand the products and keep up, sure, but I think things like prompt engineering, or sort of understanding the intricacies of this AI, they're kinda not the right direction. Nick Turley: I think there's fundamental human things, like learning how to delegate. Nick Turley: That is incredibly important, because increasingly, you know, you're gonna have an intelligence in your pocket that can be your tutor, your adviser, your software engineer. Nick Turley: It's much more about you understanding yourself and the problems you have and how someone else might help than a specific understanding of AI. Nick Turley: So I think that's gonna be important. Nick Turley: Curiosity, I mentioned it earlier. Nick Turley: I think asking the right questions, you only get what you put in. Nick Turley: Right? Nick Turley: That's important. Nick Turley: And I think, fundamentally, being ready to learn new things.
Nick Turley: I think the more you understand how to pick up new topics and domains, etcetera, the more you're gonna be prepared for a world where, you know, the nature of work is shifting much faster than it's ever shifted before. Nick Turley: So I'm prepared that my job, you know, in product is gonna look different or not exist at all, but I am looking forward to picking up something new. Nick Turley: And I think as long as you bring that perspective, you're well set up to leverage AI. Andrew Mayne: I think we sometimes over index on that. Andrew Mayne: Sometimes certain jobs go away because we don't really need a lot of, you know, typewriter repair people anymore. Andrew Mayne: Right? Andrew Mayne: And certain kinds of coding jobs are probably gonna go away. Andrew Mayne: But like I said, I think there's way more opportunity for coders, or people to create code, however it's done.
The opportunities ahead: healthcare, research
Andrew Mayne: And you mentioned the health field. Andrew Mayne: And that's one of the things I hear from people, like, oh, when we replace everything with AI. Well, I would be very happy having an AI diagnose me, operate on me, and probably do everything else. Andrew Mayne: But I do want somebody there to talk me through the procedure and hold my hand. Andrew Mayne: But also, I want people asking questions. Andrew Mayne: You know, every day I take a bunch of vitamins. Andrew Mayne: Is this the right time of day to take them? Andrew Mayne: You know? Andrew Mayne: I can't bother my doctor with all these silly little questions. Nick Turley: I really don't think you end up displacing doctors. Nick Turley: You end up displacing not going to the doctor. Nick Turley: You end up democratizing the ability to get a second opinion. Nick Turley: Very few people have that resource or know to take advantage of a resource like that. Nick Turley: You end up bringing medical care into pockets of the world where that is not readily available, and you end up helping doctors gain confidence. Nick Turley: I've often heard from doctors that they already talk to existing colleagues to get a second opinion. Nick Turley: In some cases, that's not possible, and I think you'd be surprised by the number of doctors that use ChatGPT. Nick Turley: Now, on things like medicine, there's work to make the model really, really good, and we're excited to do that. Nick Turley: There's also work to prove that the model is really good, because I think you're not gonna trust it until there's some degree of legitimacy. Nick Turley: And then there's work to explain the areas where the model might not be good, because increasingly, once it gets to human and then superhuman level performance, it's hard to frame exactly where it will fall short, which is also hard to sort of reckon with. Nick Turley: But nonetheless, I think that opportunity is one of the things that gets me up in the morning.
Nick Turley: Education might be the other one, and I think there's a tremendous opportunity to help people. Andrew Mayne: What do you think is gonna surprise us the most in the next year to eighteen months? Mark Chen: I honestly think it's gonna be the amount of research results that are powered, even in some small way, by the models that we've built. Mark Chen: And one of the kind of quiet things that's taken the field by storm is the ability of the models to reason. Mark Chen: And you already see some research papers. Andrew Mayne: I'm gonna make you explain what you mean when you say reason. Mark Chen: Yeah. Mark Chen: So this fits into the Andrew Mayne: I want you to reason through the question as you explain reasoning. Andrew Mayne: Out loud. Andrew Mayne: Yeah. Andrew Mayne: Think. Andrew Mayne: Exactly. Andrew Mayne: Your traces. Mark Chen: Yeah. Mark Chen: This really fits into this agentic paradigm that we were talking about earlier. Mark Chen: The way that the models approach solving a problem that takes some time to solve is that they reason through it, much like you or I might. Mark Chen: Right? Mark Chen: If I give you a very complicated Andrew Mayne: You can probably reason through a puzzle much better than I do, Mark. Mark Chen: I mean, I'm flattered. Mark Chen: Yeah, with, like, a complicated puzzle. Mark Chen: Right? Mark Chen: Let's just use a crossword puzzle. Mark Chen: Right? Mark Chen: Like, you might think through all the different alternatives and what's consistent. Mark Chen: You know, is this row kind of consistent with that column? Mark Chen: And you're searching through a lot of alternatives. Mark Chen: You're backtracking a lot. Mark Chen: You're trying a lot of hypotheses. Mark Chen: And then at the end, right, you come up with a well-formed answer. Mark Chen: And so the models are getting a lot better at that, and that's what's powering a lot of the advancements in math, in science, in coding.
Mark Chen: So this has reached a level where, today, in many research papers, people are using o3 almost as a subroutine. Mark Chen: Right? Mark Chen: There are subproblems within the research problems they're trying to solve which are just fully automated and solved through plugging into a model like o3. Mark Chen: I've seen this in several physics papers. Mark Chen: I've talked to physicists even, where they're like, wow. Mark Chen: Like, I had this expression that I couldn't simplify, but o3 made headway on it. Mark Chen: And these are coming from some of the best physicists in the country. Mark Chen: So I think you're gonna see that happen more and more, and we're gonna see just acceleration in progress in fields like physics and mathematics. Nick Turley: It's a hard one to beat, because, you know, I would swap many things we do in exchange for making a true, you know, significant, you know, scientific advancement. Nick Turley: But I think we can have multiple of these things. Nick Turley: I think for me, it's the fact that any well described problem that is intelligence constrained Andrew Mayne: Mhmm. Nick Turley: I think will be solved in products, and I think we're fundamentally just limited by our ability to do that. Nick Turley: So what that means is, like, you know, for companies in the enterprise, there are so many problems that are fundamentally hard that the models are not smart enough to do yet, whether that's software engineering, whether that's running data analysis, whether that's providing amazing customer support. Nick Turley: There are all these problems that the models fall short at today that are very, very easy to describe and evaluate, and I think that we'll make tremendous progress at those. Nick Turley: On the consumer side, these problems exist too. Nick Turley: They're a bit harder to find, just because consumers are worse at telling us exactly what they want.
Nick Turley: That's the nature of building consumer products, but I think it's very, very worthwhile, where, you know, there are many hard things we do in our personal life, whether it's doing taxes, whether it's planning a trip, whether it's searching for a high consideration purchase, whether that's a house or a car or a piece of clothing. Nick Turley: All of those things are problems where we need just a little bit more intelligence and the right form factor. Nick Turley: So I think the other thing that's gonna happen in the next year and a half is you'll see a different form factor in AI evolve. Nick Turley: I think chat is still an incredibly useful interaction model, and I don't think it's gonna go away, but increasingly, you're gonna see more of these sort of asynchronous workflows. Nick Turley: Coding is just one example, but for consumers, it might be sending this thing off to go find you the perfect pair of shoes or to go plan a trip or to go finish your taxes, and I think that's gonna be exciting, and we're gonna think of AI a little bit differently than just a chatbot. Andrew Mayne: One of my favorite examples, both from a utility and capability point of view and then UI, was deep research, and deep research is probably the best example we have of agentic model use right now, because it used to be you'd ask a model to tell you about a topic.
Async workflows and the superassistant
Andrew Mayne: You'd either get the data, or it would just do a big search of the Internet and then summarize all that, where deep research will go find some set of data, look at it, ask a question, then go find some new data and come back to it and keep going on. Andrew Mayne: And I think the first time I used it, other people used it, it was like, wow, this is taking a while. Andrew Mayne: And then you added a UI change so I can go away and go do something else. Andrew Mayne: And then the lock screen on my phone will show me this is working, which was a paradigm shift. Andrew Mayne: And I talked to Sam here about that. Andrew Mayne: And Sam said that was a surprise to him, the fact that people would be willing to wait for answers. Andrew Mayne: And now I've seen a new metric for models, which is how long a model can spend trying to solve a problem, which is a good metric if it ultimately solves it. Andrew Mayne: Has this been an update to you in how you think about these things? Andrew Mayne: And I guess you talked about this before with agentic, the idea that it's not just, give me the answer. Andrew Mayne: It's like, take your time. Andrew Mayne: Get back to me. Nick Turley: I think to build a superassistant, you gotta relax constraints. Nick Turley: Like, today, you have a product that is entirely synchronous. Nick Turley: You have to initiate everything. Nick Turley: That's just not the maximally best way to help people. Nick Turley: Like, you think about a real world intelligence that you might get to work with, it has to be able to go off and do things over a long period of time. Nick Turley: It has to be able to be proactive. Nick Turley: So I think we're sort of in this process of relaxing a lot of the constraints on the product and on the technology to better mimic a very, very helpful entity.
Nick Turley: The ability to go do five minute tasks, you know, five hour tasks, eventually five day tasks is a very, very fundamental thing that I think is gonna unlock a different degree of value in the product. Nick Turley: So I've actually not been that surprised that people are willing to do that. Nick Turley: Like, I don't really wanna be sitting around waiting for my coworker either, and I think if the value is there, I'd gladly be doing other stuff and come back. Mark Chen: Yeah, and we really don't do it just because, right? Mark Chen: We do it out of necessity. Mark Chen: The model needs that time to solve the really hard coding problem or the really hard math problem, and it's not gonna do it with less time, right? Mark Chen: You can think about this as, I give you some kind of brain teaser. Mark Chen: Your quick answer is probably the intuitive wrong one, and you need that actual time to work through all the cases, to ask, like, are there any gotchas here? Mark Chen: And I think it's that kind of stuff that ultimately makes robust agents. Andrew Mayne: We've seen kind of the paper of the moment, where somebody comes out and says, ah, I found a blocker. Andrew Mayne: And I remember there was one a month or so ago, and they said models couldn't solve certain kinds of problems, and it wasn't hard to figure out a prompt that you could train into a model that could solve those kinds of problems. Andrew Mayne: And we had a new one that talked about how they would fail at certain kinds of problem solving. Andrew Mayne: And that was kind of quickly, I think, debunked by showing that the paper kind of had flaws in it. Andrew Mayne: But there are limitations. Andrew Mayne: There might be some blockers, or things we don't know are going to be there. Andrew Mayne: I think brittleness is one of the things. Andrew Mayne: There is a point where models can only spend so much time solving a problem.
Andrew Mayne: We're probably at a point where we're only having the model, maybe two systems, watch each other, and we have to think about how a third system stops, or we wait for things to break down. Andrew Mayne: But do you see any blockers between here and where I'm getting at, models that are going to be solving, doing things like coming up with interesting scientific discoveries? Mark Chen: I think there are always technical innovations that we're trying to come up with. Mark Chen: Fundamentally, we're in the business of producing simple research ideas at scale. Mark Chen: And the mechanics of actually getting that to scale are difficult. Mark Chen: Right? Mark Chen: It's a lot of engineering, a lot of research, to kind of figure out how to tweak past a certain roadblock, and I think those are always gonna exist. Mark Chen: Right? Mark Chen: Every layer of scale gives you new challenges and new opportunities. Mark Chen: So, you know, fundamentally, the approach is the same, but we're always encountering new small challenges that we have to overcome. Mark Chen: Right. Nick Turley: Just to build on that, I mean, the other business we're in is building great products with these models, and I think we shouldn't underestimate the challenge and amount of discovery needed to really bring these ever more intelligent models into the right environment, whether that's giving them the right sort of action space and tools, whether that's really being proximate to the problems that are hardest, understanding those, and bringing the AI there. Nick Turley: So I think there's, you know, the technical answer, but I think there's also the, you know, real world deployment, and I think that always has challenges that are very, very hard to predict, yet worthwhile and part of our mission to do at all.
Favorite ChatGPT tips
Andrew Mayne: All right, last question, and I'll begin. Andrew Mayne: It's, what's your favorite use or tip for ChatGPT? Andrew Mayne: Mine is, I take a photograph of a menu, and I'm like, help me plan a meal or whatever if I'm trying to stick to a diet. Nick Turley: See, I really want that use case, but I've been trying it for wine lists, and that is my eval on multimodality. Nick Turley: Still doesn't work. Andrew Mayne: Really? Nick Turley: It keeps embarrassing me with, like, hallucinated wine recommendations, and I go over it, and they're like, never heard of this one. Nick Turley: So I'm glad yours works. Nick Turley: But for me, that's still a use case. Andrew Mayne: Well, I mean, maybe the wine list is too dense. Andrew Mayne: That was a problem with Operator, wasn't it? Like, originally with the vision models, with too much dense text, it just loses its placement. Mark Chen: Yeah. Mark Chen: I mean, speaking of Deep Research, I love using Deep Research. Mark Chen: And, you know, when I go meet someone new, when I'm gonna talk to someone about AI, right, I just preflight topics. Mark Chen: Right? Mark Chen: I think the model can do a really good job of contextualizing who I am, who I'm about to meet, and what things we might find interesting, and I think it really just helps with that whole process. Andrew Mayne: Very cool. Nick Turley: I'm a voice believer. Nick Turley: I don't think it's entirely mainstream yet, because it's got many little kinks that all add up, but for me, you know, half of the value of voice is actually just having someone to talk to and forcing yourself to articulate yourself, and I find that to sometimes be very difficult to do in writing. Nick Turley: So on my way to work, I'll use it to process my own thoughts. Nick Turley: And with some luck, and I think this works most days, I'll have the restructured list of to dos by the time I actually get there.
Nick Turley: So voice for me, it needs to be the thing that, you know, I both love using and wanna see improve over the next year.