Codex checks its work for you

2:24

OpenAI 11.02.2026 24,145 views 550 likes

Video description

Javi walks through a logging refactor and shows why Codex's self-verification is a step change: the model runs the app, finds the right session, and proves logs still flow.

Takeaways:
- Codex can validate its work by running tests and launching the app.
- It excels at broad refactors that touch many files.
- The model can find session IDs and query tools on its own.
- Verification collapses a risky manual loop into minutes.

When the agent can prove correctness, you can move faster with less risk.

Chapters:
00:00 Why Codex has been a step change
00:18 Self-verification: run tests and launch the app
00:52 The task: a logging refactor across many files
01:10 The risk: do not break observability
01:28 How this used to be verified manually
01:35 Ask the model to verify logs end-to-end
01:50 It finds the session ID and queries logs MCP
02:03 Proof: logs still pipe, task done fast

Contents (8 segments)

Why Codex has been a step change

I've been a huge fan of Codex for a lot of the last year. It really dramatically changed how I work, how I build software, and the app has been another step change, and it's made my job even more fun. I trust that it's going to make a lot more progress in one go without, you know, babysitting or handholding. And especially its

Self-verification: run tests and launch the app

improved ability to validate the work that it's done: to write the code and then, like, automatically run tests or even launch the app and do checks like that. It means that when I get back to that session and it says that it's done, a lot more of the time, you know, it's not just, hey, I wrote a bunch of code and now, you know, you have to fix build or compiler errors or whatever it is. But actually, you know, this code works and, you know, might need some refactoring or polishing, but, you know, I can immediately start testing this thing that I asked it to build. And that's been just transformative for all sorts of work. So this is a

The task: a logging refactor across many files

task where I've been doing a little refactoring related to logging. And this is one of those tasks where Codex can really excel because it's not a complicated task, but it did require modifying a lot of files. And there was also a bit of risk where you're modifying, you know, sort of a crucial component of

The risk: do not break observability

the app where a regression in this case would have meant our logs stop working and, you know, our observability pipeline breaks, right? So our ability to see the logs in the beta version of the app so we can diagnose bug reports would break. So then the way that I would have done this before Codex is, you know, I've

How this used to be verified manually

made a change. I'm going to compile the app and run it and see if the logs are there, right? So in this case, I just told the model and we can

Ask the model to verify logs end-to-end

give that a try. I can see it using our logs tool and it's querying some logs. It ran the app and then it tried to find the session ID by

It finds the session ID and queries logs MCP

writing some Python code. It found it right here. Nice. And then now it's using the old logs MCP to go and query that. Yeah, so I just came back to our conversation and the

Proof: logs still pipe, task done fast

model's telling me that it ran the command that I told it to, it found the session ID, and then it ran this and it found some log statements. So I can tell that after our refactor, you know, logs are still being piped. So awesome. That's a piece of work that just took me, and that's very cool, like 10 minutes on this task.
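
The check described in the walkthrough (run the app, find the session ID in its output, then confirm logs exist for that session) can be sketched roughly as follows. This is only an illustrative sketch, not the code shown in the video: the `session_id=...` output format, the record fields, and all function names are assumptions, and the in-memory list of records stands in for whatever the logs MCP actually returns.

```python
import re
from typing import Optional

# Assumed format: the app prints "session_id=<hex-and-dashes>" on startup.
# The real format is not shown in the video.
SESSION_RE = re.compile(r"session_id=([0-9a-f-]+)")


def find_session_id(app_output: str) -> Optional[str]:
    """Return the first session ID found in the app's output, or None."""
    match = SESSION_RE.search(app_output)
    return match.group(1) if match else None


def logs_for_session(records: list, session_id: str) -> list:
    """Hypothetical stand-in for a logs query: keep records for one session."""
    return [r for r in records if r.get("session_id") == session_id]


def verify_logs_flow(app_output: str, records: list) -> bool:
    """True only if a session ID was found and it has at least one log record."""
    session_id = find_session_id(app_output)
    if session_id is None:
        return False
    return len(logs_for_session(records, session_id)) > 0
```

In a real setup, `app_output` would come from launching the app and `records` from the logging backend; the point is that the pass/fail signal is derived from observed logs rather than from the code change alone.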
