# Build Hour: Codex

## Metadata

- **Channel:** OpenAI
- **YouTube:** https://www.youtube.com/watch?v=WvMqA4Xwx_k
- **Date:** 08.09.2025
- **Duration:** 53:08
- **Views:** 43,266

## Description

Codex is now one agent for everywhere you code — connected by your ChatGPT account. This Build Hour is a hands-on walkthrough of how to use all its features, including the new IDE extension and code review.

Dominik Kundel (Developer Experience) and Pranav Deshpande (Product Marketing) cover:
- What’s new with Codex? IDE extension, revamped Codex CLI, code review, and local to cloud handoffs
- How Codex works: where you can use it, and where it runs
- Live demos for pair programming with Codex CLI and IDE extension
- Best practices for structuring your codebase and delegating tasks to the Codex cloud agent
- Live Q&A

👉 Follow along with the code repo: https://github.com/openai/build-hours
👉 Codex docs: https://developers.openai.com/codex
👉 Agents.md: https://agents.md/
👉 Codex CLI repo: https://github.com/openai/codex
👉 Sign up for upcoming live Build Hours: https://webinar.openai.com/buildhours/

## Contents

### [0:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k) Segment 1 (00:00 - 05:00)

Welcome everyone to another episode of Build Hours. I'm Pranav Deshpande. — Hi, I'm Dominik Kundel. — And we're your hosts for today as we dive deep into Codex, our software engineering agent. Before we get into the content, just a quick plug for Build Hours. This is our live virtual event series for builders and developers to help you get the most out of OpenAI APIs, models, products, and anything else that might be helpful. We have one every month; it features experts from the company, but also sometimes users and customers, as we talk through all of these things. So if you want to learn more, check out the link below. If you've been keeping track of what's happening over the past month when it comes to coding at OpenAI, you'll have maybe noticed that it's been a pretty big month for us. We launched GPT-5 — which is crazy to think was only a month ago — our best scoring model. And over that time, we've also been improving the Codex CLI to really harness GPT-5's coding capabilities. And then finally, we had a pretty big release for Codex last week where we rolled out a slew of new features, including a new IDE extension. So we figured now was a great time to have this event and dive deep into everything that's new with Codex. We have a bunch of exciting demos that Dom has put together for you. We'll also discuss best practices around how to structure your codebase and workflows to get the most out of Codex. And we also have Q&A — you can submit questions to us during the course of this event using the Q&A tab in the video platform that you're currently in. Those questions come straight to us; our team will either answer them directly in chat, and we'll save a few to answer live at the end of the event as well. So before we get into the actual meat of the demos, we also figured it would be a good time to reintroduce Codex.
— Codex hasn't been around for that long. It honestly hasn't been around even for a year, and it's changed a lot. So today I just wanted to walk you through where we were and how far we've come, because that really helps set the right context for how to get the best out of Codex. We first released Codex in April with the Codex CLI. This was our lightweight open-source coding agent that a lot of you love and use today. At the time you could use it with ChatGPT or your API keys. And then we followed that up with Codex in ChatGPT, which was an asynchronous cloud coding agent that connects to GitHub, runs code remotely on your behalf, and gives you a PR. These two experiences were powerful, but they didn't really work well together. They were kind of siloed, and not really in line with how a lot of people build. So over the past few months, we've been working on addressing that and bringing these experiences together while also adding a lot of new capabilities. That's what we did last week. Codex should now feel more and more like one agent for everywhere you code. We did that by announcing a new IDE extension that works in VS Code. It works in Cursor, works in any fork that's compatible with VS Code, and it really brings the same functionality of the CLI into your IDE, so you can work with Codex alongside your code more easily. We've also been improving the Codex CLI almost daily. Because it's open source, you can actually go to the repo and see how many releases we've had. — The team is definitely shipping a lot. — Yeah, they're cooking — there was one release just last night. — Yeah, exactly, like every few days. They're making it better, improving the UI, improving the harness, just making it feel more reliable and capable.
And then code review, which is a new feature from last week: when you enable it in GitHub, you get automatic code reviews for new PRs. And then finally, a feature that I'm personally quite excited about because it's so novel — we'll see how people use it — which is the ability to hand off tasks while you're working to the cloud and pull them down, to make Codex feel more like an agent that's available for you everywhere you code. And all of this is connected by your ChatGPT account. — And it's part of the subscription, right? — Yes, so if you're on Plus, Pro, Team (which is now Business), Enterprise, or Edu, Codex is available to you. Just sign in with ChatGPT in any of these interfaces and you can start using it. With that, there's also a change in how Codex works, and so we thought it would also be

### [5:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=300s) Segment 2 (05:00 - 10:00)

helpful to discuss a mental model for Codex that can help you understand where to use it and how to get the most out of it. The best way to think about Codex as it's structured today is along two dimensions. One is where you as a developer use Codex. We just kind of went over this, right? If you like to work in your IDE and see your code, you have the IDE extension there. If you're someone who just likes to work in the terminal — you don't necessarily need to check in on code constantly, or maybe you don't mind switching tabs — the CLI is the way to go. If you're doing code reviews, it's all in GitHub. And if you just want to schedule or kick off tasks asynchronously and review them later on, you can do it on the web, and you can also do it in the ChatGPT iOS app. And then there's where Codex runs — where the actual work that Codex does for you happens. Currently that happens in two places. You can work with Codex locally on your machine. This is more like when you're pair programming with Codex: it's executing code, it's modifying the environment on your machine. And then there's remotely, in a secure cloud sandbox. That one works asynchronously. It runs on its own machine, downloads your code, and does a bunch of work for you to give you a PR that you can either merge or continue working from. And with that, we also hope that how you think about Codex, and the naming for Codex, has become simpler.
Understandably, when Codex CLI and Codex in ChatGPT didn't work together — and there was codex-1 the model, and also the Codex model from 2021 — a lot of things were named Codex, so the confusion was quite understandable. It's been a big focus for us to simplify that: simplify the naming, simplify the mental model for how developers think about Codex. Now that it's all connected via your ChatGPT account, we really want to make sure this meme goes away and, more importantly, Codex is looked at and feels like one product with one name. It's kind of like GitHub, right? You use GitHub in the CLI, you can use it on the web, on mobile. It doesn't really matter where it runs or where you use it — it's still GitHub. That's what we hope Codex starts to feel like, now that we've brought things together, and we continue shipping in this direction. But that was enough of a monologue for me. Let's actually get into the fun stuff. So, Dom, why don't you take us away with the demos? — Cool. Awesome. So, yeah, I'm going to show you a couple of ways of how all of this really comes together. As Pranav said, there are a bunch of different aspects and interfaces to Codex, but they should all really come together nicely, and hopefully you can see that throughout the next 25 minutes, with me giving you a couple of examples of how I use Codex to maintain the Agents SDK in TypeScript. And because the Agents SDK is open source, you should be able to actually see some of these things in action, including some of the pull requests we're going to create as part of this session. So, we're going to show you how you can use Codex to pair with it locally using the new IDE extension.
You can delegate tasks to Codex in the cloud, as Pranav said; we'll see how code review comes into play, and also when you might want to use Codex on your phone. So with that, let me go into my editor here. I have Codex already installed — you can find it in the extension marketplace on VS Code, or in Cursor in my case, or Windsurf, etc. I have it on the right side here, and we can start asking questions. In the openai-agents-js repository, I'm going to start off by just working locally. — Yeah. — Similar to how the Codex CLI works, if you've used that before, and I'm going to use it in chat mode. That means it's just going to answer questions — like, what is this repo about? — using the local context of the repository. In this case it's going to use GPT-5 on medium reasoning; you can change that here. It traverses the codebase the same way an engineer would. — And this is a live codebase, right? This is actually something you're working on, people are using it — it's the Agents SDK. — Exactly, it's on GitHub. You can see here it actually got some information. We can go back and see what it did. In this case, it's giving us an overview of the repository: that it's a monorepo using pnpm. All this is helpful, but last week, in addition to Codex, we launched the Realtime API GA, and as part of that I had to do a lot of changes to the Agents SDK, including updating a lot of the examples. So why don't we ask:

### [10:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=600s) Segment 3 (10:00 - 15:00)

What is the most comprehensive Realtime API demo or example in this repo? So again, it's going to go in and traverse, but this one is a bit more complicated, because I didn't ask it to just find an individual file. It actually has to go through and search the files, similar to what you would do as an engineer. It starts to read additional files to get context and hopefully find an answer. And it's not a trivial task here, because this project actually has a bunch of different packages that it has to traverse to find that answer. And since we're asking about what is the most comprehensive, it actually has to compare the level of features that are implemented. — You can see it doing that there. It's exploring the examples, right? And it's comparing them. — Yeah, and they're actually quite different. Again, you can check all this out by going to the GitHub repo. But it's going through and traversing the examples the same way that we would. And this goes to one of the first tips I would have, which is that structuring a repository well can be extremely helpful for how you're dealing with tasks. In this case, having a monorepo with clearly named projects is incredibly helpful, because it allows Codex to better navigate it, but also to understand how things might work together, or to separate them entirely by working on multiple tasks at the same time. — So, would you say that's the right answer? — Yeah, that's the right answer. And it actually gives me a command here to run this. So, I'm going to start up the server. We can see this is just a very basic Realtime API example. If I connect to it — I'm going to mute it. — Let's try this again. — All right. So, I have a start camera button here. And I had to do this last minute.
I actually used GPT-5 to build this. — I have this image input feature here where we can show things and capture an individual image. But I really want this to be something where it captures an image every second, so that we can ask questions almost as if you're passing in video. So what I'm going to do here is go back, and let's actually give it a task this time to work on. And we're going to run this in agent mode, locally again. — Yeah. And the cool thing about the modes is: chat mode is read-only, it doesn't make any changes. Agent mode decides itself what changes to make and then asks you for approval. And then you have the yolo mode, which is full access. — Yeah. And full access, as you can see here, can actually write things even outside of the workspace, which can sometimes be what you want, and sometimes it can have undesired consequences. So I would recommend sticking with agent mode unless you really know what you're doing. In my case, I want to actually update this UI. So I'm going to say update the page file here — you can reference things using @ — and I'm basically just going to describe exactly the types of changes, but not go too much into detail on how: update the page to allow for continuous image input on a one-frame-per-second basis. Make it configurable, so that it's otherwise static image input in other apps — that's because in this case the components are shared, so I want Codex to figure out how to modify the React components. We'll see that later. But one of the cool things is, while this is running locally, we don't have to wait, right? — So we can actually kick off additional tasks here. And in this case I want to run this in the cloud. You can see here I have a couple of different environments, which are these containers that Codex runs in.
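As an aside, the open-source Codex CLI also exposes these modes through its configuration file. A minimal sketch, assuming the `~/.codex/config.toml` key names from the open-source repo — exact keys and accepted values may differ between releases, so treat this as illustrative rather than canonical:

```toml
# ~/.codex/config.toml — illustrative sketch, not a canonical config
model = "gpt-5"

# Roughly corresponds to the modes shown in the demo:
#   chat  -> read-only sandbox
#   agent -> workspace writes allowed, approval asked for risky actions
#   yolo  -> full access (use with care)
approval_policy = "on-request"
sandbox_mode = "workspace-write"
```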
I'm going to pick this one, because I have a couple of tools set up that I know will help the agent verify its work. — And in this case, I want to work just off the main branch here. I'm going to say: update realtime-twilio to have the same MCP examples as realtime-next. In this case, we introduced MCP support for the Realtime API and I didn't get around to adding that to all of the demo apps. So I'm just going to kick off this task here. And I'm actually running it in four attempts — if you've never tried that, it's super helpful. But while

### [15:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=900s) Segment 4 (15:00 - 20:00)

this is running, let's kick off one more here. I'm going to run this one in the cloud again, and say: update — there's another one that I want to do — to use the same MCP examples. — And the four attempts feature, we call it best-of-N internally. It's actually cool for these kinds of tasks, because it simultaneously gives you four different PRs and you can just pick the best one. — Yeah. And the interesting thing is that once you find one that you like — and it might still not be perfect — you can then do another best-of-N on that one. So sometimes you go down this fork. But it's been super helpful, because one of the things it saves me time on is not wasting time on prompt engineering and trying to get the absolute right task definition. Instead I just fire off four and hope one of them is going to be correct. — So you see here we're currently running three tasks at the same time, right? We have our local one, and we can check in here and see what it's doing. And we have our cloud tasks here that we can go in and check on — also on mobile. And we can see here everything it's been doing so far. — But while this is working, one of the other tips I have is that you kind of have to change your mental model. With a lot of the vibe coding tools or other things, you're still ultimately the person working on it, and you just have your AI help you pair program. But in this case, you want to put yourself not in an individual contributor role but into the role of an engineering manager or an architect, where you think about: how do I structure the tasks? What are tasks that are going to come up down the line? Can I kick them off right now?
Or even with bugs: where you might encounter a bug while you're solving a problem, rather than putting it in the backlog and having it die there, you can just kick off a cloud task and have it work on the side. Later, when you come back, you can check on it, pull it down — we'll see that in a second — into your existing environment, test it out, and then submit the PR. So it's really cool for getting more done, but it requires a bit of that mental change. — Yeah. And one thing we should also mention is how you set it up to actually connect your local environment with the cloud environment, right? It's pretty easy: as long as you have the same git repository initialized locally and in the container in the cloud, Codex automatically just syncs the two. And over here, this is the DevX team, and everyone on the DevX team has their own environment for the Agents SDK. For example, in mine — I had to work a lot on parity between the Python SDK and the Agents SDK in TypeScript, so what I modified is that I'm actually cloning the Python one into my environment as well. That means my environment is a bit slower, but it also means I can ask it questions about that other repo. So if I wanted to implement a feature where it should reference the Python code, it can actually do that. So this is my environment, but then Cass on our team, for example, has his own environment, and we have a more generic one. You can have all of these different environments tied to the same GitHub repository, or you can share them if you have a fairly standardized setup. — So it looks like the local task completed. — Let's take a look here.
— One of the things I really like is that Codex gives you a really nice summary of its work. In this case, it added these environment variables so that you can define the cadence at which it captures images. And then it has a useEffect hook here. One of the nice things is: I actually don't really like this solution, and I can continue to do follow-ups on it. So, in this case, I'm going to say: actually, I want you to modify — I think there's a camera capture component, if we look at this — modify that to allow for FPS and continuous mode.

### [20:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=1200s) Segment 5 (20:00 - 25:00)

So we're just going to kick that off and let that run. Sometimes you have these situations — especially for things that are more complicated — where you don't necessarily want it to work for a long time and then realize it completely went off track. — So you can easily just undo that diff, right? — Exactly, we can undo that. But one of the things I want to show you that I did earlier: sometimes you have these larger migrations. So I had GPT-5 cook me up a nicer UI, and I want this to be implemented later on, but I actually think this is not a thing that I want to one-shot and wait forever on. So what I did is I asked Codex — you can give it an image input here — I gave it the image and asked it to create a plan. I kicked that off, and it went ahead and wrote this pretty nice, comprehensive plan that covers all of the things that need to be changed to implement this. The nice thing is we can go in and modify things here, and alternatively, once we're happy with this, we can run it in the cloud and kick off a new task — do we call it plan? Yeah — and say: implement this, right? — When we do cloud, we can actually say use those local changes. So you don't even have to push it to GitHub, right? We can just say: implement this. — So it just automatically takes the latest git state. — Exactly. — And pushes it to the cloud and keeps working from there. — And then it gives you a git diff that you can apply again later on. So I kicked that off. We can check in on some of our cloud ones, though. This one seems to be done. And we can see here, it updated the README, it implemented all of the things — you can see it has all of the MCP tool calling here. So, this looks great. I'm just going to create a PR here.
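A plan file produced this way can be as simple as a checklist that the follow-up task references. An illustrative sketch — the file name and contents are made up for this example, not the actual plan from the demo:

```markdown
# plan.md — migrate the realtime demo UI (illustrative)

1. Extract the shared camera-capture component's props (fps, continuous mode)
2. Rebuild the demo page layout to match the mockup image
3. Keep the environment-variable overrides for capture cadence
4. Verify: type-check all examples and run the demo locally
```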
One of the things you mentioned earlier that we can do now is have Codex also review your code. We have it set up on the Agents SDK so that we get automatic code reviews on any PR that gets opened. In this case, my colleague Cass opened a PR here. I haven't gotten around to reviewing it, as you can see, but Codex already went and reviewed it, and it found this little edge case that I honestly would not have caught if I had looked through it. And rather than fixing it himself, Cass just kicked off Codex with "Fix comments", and it gets the context of the whole code review, including the comments Codex made itself. It kicked off another task here, and Cass shared it with me — and this honestly looks good. So he's probably going to add that to the PR, and I'm going to feel much better about merging the PR in. — And what's cool about Codex code review is that it's not just static analysis, right? Because Codex has its own computer, what it does is: it looks at the PR, the diff in the PR, it looks at your codebase, it decides whether the diff matches the intent of the PR, it figures out whether it needs to run code itself to validate those changes, and then it provides a code review. So it's kind of like a really good collaborator — a teammate who actually has time for code reviews, instead of just saying "looks good to me". — Yeah. It's been definitely great seeing it catch all these little edge cases or things that I might not even have thought about, because it has all this context and it's operating in your environment. So it's been really helpful for that. Let's briefly check back in on some of our tasks here as well.
First of all, we have the other demo one here, and it's completed. One of the nice things is we can actually apply all of these changes locally. So if you want to test these out, this is a great way to test them — or then revert the changes. I think I might have just messed up here slightly. No, actually, this is the other task that we kicked off. And you can see here — this is what might be intimidating initially, but you're going to get used to working on five things at the same time very quickly, because you can just kick them off and they're no longer occupying space in your mind

### [25:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=1500s) Segment 6 (25:00 - 30:00)

— where every time I'm thinking of a task I have to do, or a feature, I'm just going to kick it off — out of sight, out of mind — and when I later have the time to actually look at it, I'll come back. — Yeah, you just need to keep track of which codebase you're in and what branch you're on. And the more familiar you are with the codebase, the more you get out of it, right? Because you can just zoom through stuff by orchestrating tasks with Codex. — And in this case, it seems like it changed all of that. It still kept the environment flags, which is nice, so that we can configure that. I'm just going to create a PR here. And I'm not going to stage this — we kicked off the plan, we don't really need this anymore, so I'm just going to get rid of that. Then I'll push that off, and we can go in and create a new pull request here. That should kick off the whole review on GitHub later on. So those are a couple of different things. One of the other examples I wanted to give you — where, again, when inspiration strikes, you want to just kick off a task — is from a few months ago in June, when I was working on the open model release. One of the things I had to do was for our harmony parser, which is the parser that handles all the tokens generated by gpt-oss and turns them into a structured array of messages: I had to add a streamable version of it. I had it on my mind and on my list to do eventually. But then, I think it was 11:30 at night, I was about to go to sleep, but I had this problem stuck in my head, and I had an idea how to solve it. Rather than trying to go to sleep and then trying to remember tomorrow what I wanted to do, I just pulled up my phone, opened the ChatGPT app on iOS, and wrote — this is the exact thing that I wrote.
I outlined roughly what the problem statement was. I told it: hey, this is what I want it to look like in Python. And one of the things to know about this particular project is that it's written mostly in Rust with a Python wrapper. — And I'm not a big Rust developer. — So I just gave it what I wanted it to look like in Python, and I said: okay, figure out the rest. Figure out how to implement it in Rust. Figure out what the interface should look like. Also use your own judgment on how we can reuse as much as we can, and create things where necessary. And it created this pull request. There's been some change in the repo since, so it doesn't show the original pull request anymore, but if you go into the harmony repo, which is open source as well, you'll find that a lot of this code still exists — because I woke up the next morning and had code ready that I just ended up creating pull requests for, and it worked. This was one of those magical moments: I was expecting to have to go in, find it broken, and spend 20 more minutes prompting it to fix it. But it just worked. — Yeah. And it was not an easy problem. You can actually go through the logs here and see it having to navigate for quite a while — it took seven minutes to solve this. — And now you never have to learn Rust. — Exactly. At least mostly — I can review Rust now, but then I can send Codex off to review that, right? So that was another tip I would have: as you're tackling problems, try not to wait until you get back to your computer. If you've set up an environment on Codex, you can kick off tasks whenever inspiration hits. Sometimes I do this when I'm on my way to work.
I quickly send off a task, and by the time I get to the office, I already have a working version — or at least a draft I can work off, by either pulling it into the IDE and taking additional turns there, or kicking off another task to refine it by asking for changes. — Yeah. And speaking of environments, one thing we should also discuss is the use of markdown

### [30:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=1800s) Segment 7 (30:00 - 35:00)

files, right? So maybe starting with AGENTS.md. When you were discussing structuring your codebase — the cool thing is that OpenAI models are going to be increasingly good at picking up on AGENTS.md as a file in your repo. So when you ask a question like "get up to speed on this repo" or "what's going on here", it'll know to look for this file. This is where you document all your conventions about the codebase, right? So now, because of AGENTS.md, Codex can get up to speed — not quite instantly, but very quickly. — Yeah. And honestly, AGENTS.md files can range from relatively short, in very standardized repositories, to longer, where you really want to guide it to be more efficient. One thing I would say is also very important — and we talked about this briefly earlier — is to make sure the agent has a way to check its own work. People are not necessarily always fans of writing tests, but having the ability for the model to use tests is incredibly helpful. You're going to see Codex often write its own tests where applicable. If we're asking for changes in the actual SDK part of the Agents SDK, it's going to start creating tests automatically, without you having to ask, because there are already existing tests and it knows how to execute them. For other things — for example, the examples — we don't have tests, but we use TypeScript, so it can go in and actually compile them and make sure that at least all the types are correct. All of that increases the likelihood of it actually succeeding, because if it runs into issues, it's going to start fixing them. — Yeah. I'm definitely more on the vibe coder end of the spectrum, and what I like to do when I'm working on side projects is just have AGENTS.md include instructions that Codex needs to write tests every time it implements a substantially new feature. It does that pretty faithfully, and it's cool — I can just watch the tests run every time something new is ready, and that way I don't even have to look at the code, which is great for vibe coding. — And one of the things I did yesterday: I needed tests, and this goes back to Rust — the Codex CLI is written in Rust, and I was contributing a feature there and needed to add test coverage. Rather than just saying "okay, create tests for this", I had Codex run a planning step first and create a plan.md file outlining what it was going to test, and then I kicked off a new task that referenced that file to build the tests. This way I could first review its test strategy — what tests it was going to create — then kick it off, and make sure it actually created high-quality, valuable tests that could be merged in. — Nice. And you can have AGENTS.md files nested, right? So if you have a really complex codebase, you can give Codex different instructions at each level. — Yeah, in our main monorepo I think we have 80 or so AGENTS.md files. It's helpful especially if you have different stacks within the codebase. In our case, we just put in everything. Awesome. With that, let's go back and talk a bit about the best practices that we covered. — Yeah, let's do it. — So we covered a couple of different things throughout this session. One: really structuring your code and project to allow for collaboration can be incredibly helpful. Think about it the same way you might structure your repository to be helpful for other humans to collaborate on it.
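For a pnpm monorepo like the one in the demo, an AGENTS.md along these lines could work. This is an illustrative sketch — the directory names and commands are assumptions for the example, not the actual Agents SDK file:

```markdown
# AGENTS.md (illustrative)

## Layout
- `packages/` — SDK packages; each has unit tests, run with `pnpm test`
- `examples/` — runnable demos; no tests, but they must type-check

## Conventions
- TypeScript strict mode everywhere; run `pnpm lint` before finishing
- Add or update tests for any change under `packages/`
- For changes under `examples/`, verify they compile with `pnpm -r build`
```

Note how each rule doubles as a self-check the agent can run — which is exactly the point made above about giving the model ways to verify its own work.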
If you're structuring it to have individual parts that can be worked on in parallel, it also helps Codex work on multiple tasks at the same time without you running into giant merge conflicts. And so, for example, having smaller projects that are all written in TypeScript, so it can check itself, can be incredibly helpful. — On that note actually, we had a question come in around how you make it so that multiple agents can work on your codebase without messing things up, and that's the answer. — Yeah, exactly. We originally started this whole monorepo when we built the Agents SDK just for the convenience of bundling different packages for end users. But it was incredibly helpful for us to limit the impact of work, where you could go in and say, all right, change everything in the real-time part of the

### [35:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=2100s) Segment 8 (35:00 - 40:00)

SDK, or change everything in core, and it would be able to split that work without the tasks interfering with each other. — Nice. — And similarly, splitting things up into individual files is going to make things easier. If you have a bunch of agents working in the same file, you're going to have a bad time. The nice thing, though, is that because Codex operates much more like a software engineer, it actually tends to componentize things much more regularly, or it sometimes knows where it could interfere with things. For example, earlier it initially gravitated to touching only the one file that I told it to, so it doesn't mess with the other files unless we explicitly tell it to. — The other part is, and I think this is the thing that takes some time to get used to: you really want to get into the mindset of an architect or an EM, where you're much more focused on delegating tasks that you can't get to immediately, and thinking ahead of time. If you're only thinking about the task that you have to do right now, you're going to have a harder time getting the most out of Codex. — Definitely. The third one is having tests and ways for Codex to check its own work, because the more you have, whether it's linters or formatters or tests, the more Codex is able to make sure that what it returns to you — has a high chance of being merged in. — Yeah. And then in full access mode you can also have tests that are more active, right? Like a database test or an API test: Codex can actually run commands to ping the database, ping the API, and make sure the code is working. So you can get into a situation where every time you ship a new feature, it's not only testing the static code, it's more comprehensive. And then you have to do less of that yourself, or just review the test results. — Yeah. And you can even connect the Codex CLI and Codex IDE extension to MCP servers.
So if you have some MCP servers that help you or Codex with verifying things, you can connect those as well. — The fourth one is using Codex to outline plans as markdown files for complex tasks. This has been incredibly helpful, because it's almost like a blueprint review that you would do with another engineer: it's just helpful to have a document you can go through and figure out, okay, does this make sense? You iterate on it a couple of times, not necessarily in the chat but in an actual artifact, so that you could kick off potentially multiple tasks with slight variations of the same file rather than having to copy and paste things along. — Also, one thing to add here: the CLI has a helper command for agents.md. If you're in whatever directory you want the agents.md in, just run /init and — oh yeah — it autogenerates an agents.md for you. — Yeah. No, I think that's how our agents.md initially came to be — and then you just iterate on it. — Exactly. You see what works and what doesn't, and where you might see the agent do things that you don't want it to. — The other one is: use Codex code review on your GitHub. Honestly, even if you're not using any of the other parts of Codex, this in itself is a really helpful thing, to just — make sure that every new pull request that gets created automatically has a code review attached to it, which helps you kickstart your own code review. — I mean, internally at OpenAI, Codex reviews, I wouldn't say 100%, but a large chunk of our code. It's just becoming integral to how teams ship here. So definitely check it out. — Absolutely.
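Since the Codex CLI and the IDE extension share configuration, MCP servers are typically wired up in a single config file. A sketch of what such an entry might look like, written to a local file here purely for illustration: the real file lives under `~/.codex/`, the exact key names are my recollection of the Codex docs and should be verified there, and the server name is hypothetical.

```shell
# Sketch: a config.toml with a model choice and one MCP server entry.
# Key names (model, mcp_servers.<name>.command/args) are assumptions to
# check against the Codex docs; "my-docs-mcp-server" is made up.
mkdir -p demo-codex-home
cat > demo-codex-home/config.toml <<'EOF'
model = "gpt-5"

[mcp_servers.docs]
command = "npx"
args = ["-y", "my-docs-mcp-server"]
EOF
```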
And then the last one is to trigger tasks as inspiration hits, either with the mobile app, in Codex web, or in the IDE extension. Honestly, as you're seeing things, don't try to remember them or put them in the backlog; just shoot off a task to Codex. You can treat it almost like your to-do list. As I'm working through things, sometimes I just kick things off and they don't even get merged; they just create inspiration for me, and then I go in and copy the git apply command, or pull it down with the IDE extension now, to have a starting version. — I actually like to use Codex mobile to learn about how things work, for example in our Codex CLI repo. I'll just, in ask mode, say, hey, tell me about how this thing works, and then I don't need to bug you or the engineering team. — No, that has definitely been a helpful feature. I've done this a couple of times, where in codebases I'm not even familiar with, I would just go in and ask it a couple of questions about how a certain feature works, or whether a thing is possible

### [40:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=2400s) Segment 9 (40:00 - 45:00)

without having to bug the product team. — Um — cool. — So yeah, those are the six tips that we covered that I think are worth taking away. There's obviously a lot more. We have a "How OpenAI Uses Codex" guide on the startups page if you want to check that out; it has additional really helpful pointers — and we'll follow up with more resources on Codex after this. But yeah, that was the demo and the walkthrough. Now over to Q&A. I think we should refresh the slides, Dom, so the latest questions are in here. All right, first one. — Okay. Yeah, the camera is obscuring it a little bit, but: could you please elaborate on the main differences between using Codex in Cursor versus using Cursor with GPT-5? You want to take that? — I think ultimately it's two different harnesses, and I've found myself sometimes switching between the two. Codex in Cursor uses GPT-5, but it also integrates into the rest of the Codex ecosystem, right? So being able to pull down your changes, kick off tasks, and constantly switch between that local and cloud environment is probably the biggest aspect of it. — Yeah, for sure. So definitely use both, pick the one that works best for you. There's obviously a lot of power in both approaches, but obviously we're using a lot of Codex here — in the long run. — Yeah, I think we covered this a little bit, but maybe we can go into it in some more detail. We discussed how you can use different agents, and how to structure things so that different agents can work in parallel without touching the same file. — Yeah, so one option is to use git worktrees to work on multiple things at the same time. If you're running multiple tasks locally, it can definitely get in its own way.
But that's why I typically run one task locally and then kick off a bunch of tasks in the cloud. And the nice thing is, as you might have seen, you can actually use the local state of your project. So if you got to a state where you want to kick off three things at the same time but you don't want to commit yet, you can still kick those tasks off in the cloud, let them run there, and when they're done, you just copy the git apply command back into your editor, or hit the apply button in the IDE, and then you can merge them in and iterate on them that way. That is the thing I've found easiest. But yeah, you can use git worktrees to confine work. Also, GPT-5 overall is very good at instruction following, so it should be less likely to touch things that it's not supposed to. — Yeah, like with your example, it only touched the page — file there. — Yes. Um, yeah, you can connect Codex to your local MCP servers. Also, the Codex CLI and the IDE extension share the same configuration, so whatever you configure in the Codex CLI is automatically available in the IDE extension, and you can switch between the two if you want to. — Very cool. And this opens up all kinds of things, right? Like — does Codex allow prioritization of tasks when multiple are running in parallel? — I don't think so. Running multiple tasks in parallel locally is a bit more challenging unless you're using things like git worktrees, and in terms of prioritization in the cloud, it doesn't really matter, because they're all running in their own environment. — Yeah, each task is basically a new container where it downloads your environment and keeps running, so they don't really interfere with each other at all. Yep. Okay.
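The git worktrees approach mentioned above can be sketched like this; the repo and branch names are made up for the example. Each worktree is a separate checkout of the same repository, so parallel tasks get isolated working directories.

```shell
# Sketch: one checkout per parallel task, so agents don't trample each other.
# Repo, branch, and directory names are illustrative.
git init -q wt-demo && cd wt-demo
git -c user.name=demo -c user.email=demo@example.com \
    commit --allow-empty -q -m "initial commit"
# Each task gets its own working directory on its own branch.
git worktree add ../wt-demo-task-a -b task-a
git worktree add ../wt-demo-task-b -b task-b
git worktree list   # shows the main checkout plus one line per task
```

When a task finishes, `git worktree remove ../wt-demo-task-a` cleans up its directory while keeping the branch.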
Um, maybe, I mean, we'll follow up with these resources, but let's do a quick read through the Q&A to see if there are any other questions. — I think there's one on how to change the model to o3. I've not tried that in the IDE extension. — On the CLI you can just pass the -m flag if you want to change it, although we would recommend GPT-5; it definitely works best with GPT-5, especially because it has some features

### [45:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=2700s) Segment 10 (45:00 - 50:00)

that really improve the quality of the tool usage. — I think you can change it in the IDE extension, but I've not tried it. — Yeah, I haven't. I've been too enamored with GPT-5 to even think about other models at this point. But in the CLI, instead of signing in with ChatGPT, you can also use your API key if you want to, and that actually allows you to use the open-weight OSS model — Yep — which you can run locally, or even any of our other models like o3 or 4o if you want to use those. — I think there's a question on whether Codex will come out with code review for security vulnerabilities. So the code review feature is pretty comprehensive; we don't scope it to any particular type of review, and you can actually ask it to do a security review. The automatic code review feature works such that it infers what you want reviewed, but by tagging it in the PR you can just say "@codex review this PR for security issues" and then it'll focus on that more than anything else. So it's pretty flexible that way. You should give it a shot and let us know what you think. — A couple more. Let's see, over here. I think this one is interesting, around how the software development life cycle has changed. One of the things that was personally fascinating for me to see with the Codex team leading up to the launch was that — everyone was contributing pull requests and changes. It wasn't scoped to the key engineering team; the PMs, the designers, everyone was contributing and improving Codex, and that was really fascinating to see, everyone swarming in and iterating very quickly on things.
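The model switching discussed above can be pictured as below. The wrapper's dry-run fallback is my addition so the snippet runs even where the codex binary isn't installed, and exact flags may vary across CLI versions.

```shell
# Sketch: picking a model with the -m flag, as discussed.
# cx() is a made-up helper that falls back to a dry run when codex is absent.
cx() {
  if command -v codex >/dev/null 2>&1; then codex "$@"
  else echo "[dry-run] codex $*"
  fi
}
cx "explain the build setup"          # default model (GPT-5)
cx -m o3 "explain the build setup"    # explicitly pick another model
```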
And some of the things I love seeing: occasionally a piece of feedback pops up in a Slack channel, like "it would be cool if Codex could do this," and the response is just a link to a running Codex task. I've done this similarly with the Agents SDK and other things, where I'm getting some feedback and I just kick off a Codex task and see if it can take the first step. So your backlog significantly shrinks just by the number of tasks you can kick off to Codex. — And for updating public docs that are backed by git, I can now actually just submit — string changes without worrying about git, et cetera. — Yeah, you don't even have to know where it is anymore. You can just say, oh, change this thing, kick it off, and it will find the right place. And if you have a setup with PR previews, where it will deploy your branch, it's really easy to verify things. — One question around whether we have thoughts on spec-first development, and if so, any good examples. I think that's kind of what you showed with the plan.md. — Yeah, exactly. For more complicated projects, I think it's definitely helpful to do a spec-first approach, especially because you can then potentially break it down into multiple tasks and send them all off at the same time. It's super helpful, especially because GPT-5 is so good at instruction following that building out the plan first really makes sure you stick to a plan that both of you are on the same page about. — Yeah, definitely. And then I guess you could also set up a folder in your repo with all the plans, right? And have statuses on whether they've been implemented.
That way there's even more context. — Yeah. I mean, there are some questions, and maybe I'm just inferring from this one, about actively writing code alongside Codex, rather than prompting it to modify entire files. In the IDE extension, you can easily copy code blocks in and ask questions about them, right? — Yeah, exactly. You can actually mark things and add them as context. — The other thing that's interesting in the IDE extension, you'll notice, is that you can leave little comments for Codex to implement things, and then you get a little hint that you can send it to Codex. — Oh yeah, the TODO keyword. — Exactly. Any TODO comment where you mention Codex, it's just going to automatically suggest things. — One more question: is the Codex CLI agent the same as what's behind the

### [50:00](https://www.youtube.com/watch?v=WvMqA4Xwx_k&t=3000s) Segment 11 (50:00 - 53:00)

Codex IDE extension? — Yes. — Yes, it's the same harness, just in the terminal versus the IDE UI, where you see all the nice buttons and dropdowns that are, I mean, hard to do in a terminal UI. But if you like something more bare-bones, the CLI's got a pretty legible UI now. — The CLI also has nice features we didn't cover, around scriptability: you can run codex exec with a prompt and it will just operate on that particular command. So if you have things you want to automate, this can be incredibly helpful, or if you want to run multiple Codex instances in parallel without having to manage everything in your code editor, that can be helpful as well. — Yeah, you can just set it up in a Docker container somewhere and wire up your own automation if you want to. That's pretty cool. Let's see if there are any others that we missed. I think we've covered most of them. Okay, one question came in right as I said that. Do you know the answer to this? I don't think we have any pre- or post-hooks right now to get notified when a task is done. It's definitely something we've heard requested before. — For sure. And yeah, OpenTelemetry is also something we don't have yet. — No. — But we'll get there. — Yeah. — I think that's it in terms of questions. Before we conclude, first of all, thank you for being here; hope that was helpful. A lot of what Dom showed you was on open-source repos, so you can actually see the work Codex has done: go to the Agents SDK repo on GitHub and filter the PRs by the Codex label, and you can see the PRs Codex has submitted, if you want to see for yourself the kinds of code it writes. There are also open-source tools like PRarena.ai, where you can see how many Codex tasks exist on public repos; it's a cool way to explore that. Otherwise, we have some resources here that you can use to get started, and hopefully this content was helpful. Before we conclude — I just need to go to the next slide — yeah, just a quick plug again for Build Hours. If you're interested in building with OpenAI APIs using Codex, join us on October 9th, where we'll be talking about how to get the most out of the Responses API, including how to get the most out of it using GPT-5. But with that, I think we can conclude. — Yeah, thanks for joining us. — Thank you for joining us.
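As a closing aside on the codex exec scriptability mentioned in the Q&A, here is a rough sketch of wiring it into your own automation. The run_task helper and its dry-run fallback are my additions so the loop can be exercised without the CLI installed; the prompts are made up.

```shell
# Sketch: batching non-interactive Codex runs with `codex exec`.
# Falls back to a dry run when the codex binary isn't installed.
run_task() {
  if command -v codex >/dev/null 2>&1; then
    codex exec "$1"
  else
    echo "[dry-run] codex exec: $1"
  fi
}
run_task "update the CHANGELOG for the latest release"
run_task "fix all lint warnings in src/"
```

This is the shape of the "wire up your own automation in a Docker container" idea from the discussion: a plain script invoking codex exec per task.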

---
*Source: https://ekstraktznaniy.ru/video/11260*