# Live Demo Showcase: Tools That 10x Your Codebase

## Метаданные

- **Канал:** OpenAI
- **YouTube:** https://www.youtube.com/watch?v=-l0OqapibAA
- **Дата:** 08.10.2025
- **Длительность:** 30:05
- **Просмотры:** 12,170

## Описание

The best dev teams scale impact, not headcount. In this session, founders from Charlie Labs, Warp, CodeRabbit, and Jam will demo tools that speed up shipping, streamline code reviews, and keep your codebase clean—turning small teams into force multipliers.

## Содержание

### [0:00](https://www.youtube.com/watch?v=-l0OqapibAA) Segment 1 (00:00 - 05:00)

Earlier this year, a tweet went viral with a new term, vibe coding. Now, it got a lot of polarizing reactions, but I think one thing we can all agree on is it really changed the way we saw how AI was going to transform software engineering and how products are built. But we're not here today to talk about buzzwords or hype or what might be possible. Today, we are here to show you startups that our team has been working closely with and we have seen firsthand how their AI coding tools have helped tiny teams 10x their code base and punch way above their weight. My name is Sarah Urbonus and I lead startup marketing here at OpenAI. I'm thrilled to be joined on stage here today by four incredible founders who have built tools to help you write, review, ship, and fix code faster than ever before and without any of the AI slop. So, please join me in welcoming our first speaker, founder and CEO of Warp, Zach Lloyd. — Let's go. — All right. Hi, I'm Zach, the founder and CEO of Warp. I'm very excited to be here today. Now, you've probably seen a lot of incredible demos today of uh what development agents can do for creating code, but the truth is if you're a professional developer working on a big codebase, it actually can be hard to apply agents to build software. Now, at Warp, our mission has always been empowering developers to ship great life-changing software, and we're trying to use agents to do that now. So for my demo today, uh I'm going to use warp to build something within warp, which is a million lines of Rust code, and try and show you how I approach development on real production grade code bases using agents. So this is warp. Uh I'm going to go ahead and get started with a prompt here just so I can get the agent working and then I'll explain exactly what it is that I'm building. Please take a look at the two attached screenshots. The first one shows the state of the current branch and the second one shows the desired state that I'm trying to build towards. In the desired state, the QR code is at the top of the referrals page and there's a new button called copy invite link that's on the same row as the email form field. Please help me build this. Oh, and by the way, I'm doing this live at OpenAI dev day and I need to do it within the next four minutes and 30 seconds, so go quickly. Also take a look at the reference file to see where the code is. So I just spoke to warp. Uh this is often actually how I start today. Warp transcribes it for me. I owe it a couple references here. So let me go ahead and refer to a file and let me attach some screenshots. Um how do I do that with this setup? Well, there we go. So I drag those in and then I'm going to go ahead and start the agent. And this agent is using GPT5 which is one of the main coding models that we use in Warp and it's very good. So while this is going, let me pop open another tab here in Warp or another pane I should say. And in Warp, our roots are as a terminal. So you can use any terminal pane as either a place to run terminal commands or a place to launch agents. I'm going to go ahead and use it here. Actually, I'm going to run a terminal command which is local version of Warp. I'm just going to show you what it is that I'm working on. So, you can see here in Warp, we have a referrals feature. This is something that if you're a Warp user and you want to get cool t-shirts or swag, and there's a little Easter egg in here. Uh, you can use this and we're adding the ability to have a QR code for people who are demoing Warp on video so people can get their referral link. But our designer, Peter, actually asked me to do this in a different way. So, he wants the QR code to be at the top of the page. And we don't want this link form field. We want an email form field with a copy invite link button here. So this is what I've asked the agent to change for me. So let's go back and while the agent is working here actually let me tell you just a little bit more about what warp is in general. So warp is what we call an agentic development environment. And the idea here is that it borrows some of the best parts of the terminal and IDE all with a vision of like how do you have first class support for building with agents. And so one way of conceiving of this is warp is just a general purpose interface for telling your computer what to do. And if you tell it that it will do it. It has four parts. We have a coding agent which is actually really good. It's a top three agent on both suede bench and terminal bench. We have facilities for multitasking with agents which I kind of just showed you. Uh we have a really

### [5:00](https://www.youtube.com/watch?v=-l0OqapibAA&t=300s) Segment 2 (05:00 - 10:00)

nice just terminal interface. So if you just want to run terminal commands and then we have a knowledge store where you can store workflows and data for agents and humans on your team. So let's go back to the demo and see where we're at. And it looks like we've uh made some coding changes. So I'm going to go ahead and run a terminal command here to start um building these. Actually type Um, okay. So, I'm going to go ahead and build this. And then while this is building, I want to show you how I use warp to work on real code bases. And so, one of the key things here that I do now as a developer, if I'm coding by prompt, is I like to look at um exactly what the agent is doing as it does it. And so, actually, it looks like we have a compiler here. Let me see if the agent can go ahead and generate this fix for me. Try to fix it one more time. But the idea is that I like to see what the agent is doing and actually have a view where I can code review it as it goes. And so that's what you get on the right side here. I could actually add comments for the agent to make changes. I could revert the agents change if I wanted. And I could even edit it. And so we've brought in some of the editing features of the IDE to make this experience more smooth. So we're building here. Let me take a look at what it's done. If it did the right thing, which is possible. Live demos. There's always some risk here. Uh we're still waiting for Rust to build, but yeah, this looks generally right. Determine the referral URL for the QR code. Okay. Um and so yeah, while this is going, the other thing that we focus on in Warp is how do you it's really how do you have a very tight iteration loop as the agent works to make sure that what it builds can actually go all the way from prompt to production. So hoping we get there. So we're about to build and run. Let's take a look at what we've got. Just enter this in and see. It actually made the change for me, which is awesome. It's not quite perfect. There's a little bit of a fade on the link here, but it put the QR code at the top. Anyhow, thank you very much for the time here. This is how we use Warp to changes. You can check it out at warp. dev. And now I'm going to go ahead and now that we've shown you how to write code, you'll need to review it. And for that, I'm going to welcome Harzo Gil, founder and CEO of Code Rabbit. — Thank you. — All right. Thank you, Jack. — Yeah. So, as we saw in Zach's talk, generative AI like is the most compelling use case for that has been in terms of largest use case in token usage has been the code generation. We have seen like anywhere from 30 to 40% code and up to even 90% code in many startups is now being written by agent ready AI with all the way from like tab completion to coding agents in the terminal in the IDE and now even like background agents but all that volume of code being written in short period of time is now leading to like second order effects where we have seen code reviews are now becoming like a new bottleneck because the shipping velocity has not really changed in all these organizations right And the companies that we talk to, we hear from many senior developers that now they are like overwhelmed with a large number of review pull requests that they have to now review which are sitting in their backlog. And it's gotten to a point that it's becoming like really unsustainable for humans to really keep up with the agentic software development. And that's exactly where code rabbit comes in. So, Code Ravbit is the leading AI code review solution which provides like a critical trust layer that sits between your agentic software development and your production and it kind of provides like a central quality gate through which you uh each developer when they open a pull request it has to go through and it flags all kind of issues in these pull requests all the way from security issues to applying best practices and even enforcing custom policies. So, Code Rabbit is like pretty popular like it's used by over like several hundred thousand developers each day and we have reviewed like millions of pull requests in the last couple of years and found like millions of issues in those pull requests. So, today I'm going to like show you like a few open source repositories where we are being used. So, code review has a pretty massive footprint in the open source as well and we'll like see some examples uh showing code rabbit in action. So the first pull request this is a very recent pull request from a project called clerk which is a authentication platform and a user management platform for nextjs and javascript applications. So one things I want to point out is that unlike a lot of the generative AI products uh code rabbit is not primarily a chatbased interface. It's a background agent which has which pretty much kicks off as soon as the developers open up a request. So it's like a background agent with like kind of a zero activation

### [10:00](https://www.youtube.com/watch?v=-l0OqapibAA&t=600s) Segment 3 (10:00 - 15:00)

energy like you don't have to remember to use it and within a few clicks your entire organization gets onboarded into our system and in this pull request like code rabbit basically creates a sandbox environment in its cloud service clones the entire repository and does like a deep analysis on the code and once it done I mean it takes like 5 to 10 minutes it's part of your CI/CD flow and once it's done you kind of get to see a few things one is like from the code changes we are able to understand what the payload was and so that we can show like a walkthrough of the code changes. So if you're a human reviewer coming into a pull request, you get like a starting point, a bearing into what these changes are. Another nice thing we do is like we actually create a sequence diagram which I think a lot of our customers and the users love this feature where it kind of shows you the new logical flow that has been introduced by this pull request. But the main value of code ra code rabbit has been in the form of review comments. So we provide actionable inline review comments just like a human would which flags potential issues and even like providing refactor suggestions and sorted by criticality. So in this case we found an issue which was later addressed by the developer in some of in the in the in the iterative commit. And in this case what you will see that we do a couple of things like we also provide the developers like a way to accept one-click suggestion. So each time we flag an issue we also generate like a fix for that issue which you can just click a button in GitHub and and it will commit it for you. And the other cool thing we do which a lot of our users love is this agent to agent handoff. So we also like generate a prompt that you can take back to your coding agents like warp or cursor or codeex to actually go and fix the changes. So creating like a feedback loop. So using agentic software development then coding which is the inner loop and then during the review stage we find all the um issues which you can go back and fix using generative AI. So what makes code rabbit so great and what it does is the context we bring in. For example, if I show you under the review details, you will see that like we're bringing in lot of additional context all the way from understanding the code graph. Um, where we are able to pull in definitions from other files. For instance, in while reviewing tanstack. ts, we are like pulling in definitions from other files. And also like we are providing the users to provide their own custom instructions so they can tailor the reviews for their own organization. And these instructions could be about best practices or how they like want to code in certain styles and so on. But often like the context we bring in up front in front of the agent is not enough like the context windows are still small and the production code bases seemed like are pretty vast right so that's where like we do other thing which is like agentic ver verification so we run like an agentic explorer which kind of navigates the code like a human does and then it flags issues even in the parts of the codebase that have not changed. So in my second example which by the way is from the bun project. Bun as you know is like a high performance up andcoming JavaScript runtime. And this particular PR has been opened by Robo Bun which by the way is their Asiantic um software developer like it kind of opens end toend PRs. And in this PR if I scroll down one of the comments you will notice is a comment that uses the verification agent. So as you can see what codeabit does in this case it basically generates shell commands terminal commands to do code review. Essentially we are doing code generation to do code reviews. And in this case the agent is going in and running a grep. I don't know how many of you are familiar with that but it's a tool that allows you to pull in patterns from the codebase based on abstract syntax trees. So it actually wrote a pattern search to read the function definition because it was not in the provided context. And it's also running some rip crap queries on its own like it's also like finding some keywords which it can go and search in the entire code base or in other files. Now what this agent verification does is two things. First of all it helps code rabbit suppress a lot of noise because a lot of the comments we will generally generate as a first pass are usually surface level comments not with the whole picture of the codebase. The second thing it does is it finds issues which are like ripple effects in the entire codebase. Right? So that's been one of the things and the third thing I want to show is that it's Code Rabbit is like a very collaborative kind of a peer uh developer on the team. Um and in this example again from the bun project you would see that this is an open source contributor who made a code change which was flagged by code rabbit but because code rabbit does not have a lot of the tribal knowledge it's just making a guess that based on PR intent the code should look some should be slightly different and in this case the developer goes back and says even though the error doesn't say so the the comment is

### [15:00](https://www.youtube.com/watch?v=-l0OqapibAA&t=900s) Segment 4 (15:00 - 20:00)

incorrect like I mean so there is some confusion here so what codeabit does is it replies back to the developer but also creates a learning and learnings are more like long-term memory. So in a way like when code rabbit is provided any non-obvious knowledge or a tribal knowledge it kind of remember the facts in like a long-term memory which it uses in the future reviews to make the future reviews more tailored and more relevant because it's a central product. So your entire team can collaboratively like provide learnings to code rabbit and make it better over time. And as you can see as a follow- on message, the developer is now engaging code rabbit and it's very contextual chat to come up what with the right kind of error message um that should be shown for the user and that's where LLMs have been great with like they are they understand the UX the LLMs have been trained on best practices. So this becomes like a really indispensable tool uh to bring into your team. Now with that I would also like to highlight that code rabbit has been one of the biggest users of reasoning models in the world. Now code review is one of the use cases where we need a lot more reasoning than even code generation like maybe planning is other use case where you need heavy reasoning. Um but then the other one is code reviews and I want to like point out that GPT5 has been like a big game changer for our use case. So we have seen like almost a 70% jump in improvement than the previous generation of reasoning models or the models that we were using prior to GPT5 which has vastly made code rabbit much better in the last few months alone. And with that I would love to now next introduce Riley who's a founder of Charlie Labs who will come and talk about how to ship software in the era of AI. Hey, I'm Riley Thomas, the founder of Charlie Labs. Uh, Charlie is a fully autonomous TypeScript engineer that helps team ship code faster. He does everything you're familiar with from CLI and IDE based agents like writing code, answering questions, generating plans, and reviewing code. But what makes Charlie special is that he collaborates with your team directly in GitHub, Linear, and Slack. writes slopfree Typescript code fully autonomously and proactively finds and fixes bugs and tech. Um, Charlie is a full team member, not just a single agent that runs in isolation on a laptop. Let me show you what it looks like to work with Charlie. Okay. Um, so these are some linear issues that Charlie proactively created earlier today. Um, the bugs and tech debt. Uh, Charlie looked through our sentry and through the repo to find these and created the issues here. Um, let's get started and just assign these to him. Okay. Um, so now Charlie is going to start working on these issues. Um, and while he Oh, so he's already updating the status here. This is one of the nice things is Charlie is good at showing what he's doing. Um, so we'll look at one of these bugs here. Let's do this one. Um, so this came from Sentry. As you can tell, there's the trim stack trace. Um, and then Charlie was able to cross reference that with the source code that he also has access to um, to find the relevant code that caused this um, and correctly identify the root cause which is using JSON object without actually including JSON in the prompt. So this is a great catch. Um, the cool thing about this is it's then very easy for Charlie to fix because he's already effectively figured it out. Um, and you can see here he's posted a comment. He is already reviewing his changes, it looks like. Um, and he'll keep going with this. We'll come back to these in a minute. For now, we're going to look at how to make features with Charlie. Um, so Charlie makes a really good collaborator in Slack because he has access to your source code, GitHub, Linear, and then the intelligence of GBT with web search. um which makes him fantastic for brainstorming, coming up with specs, all of that. Um so in this case, I'll quickly show you the demo app that we're working on. The lack of dark mode is pretty harsh on this screen. Um so we can fix that. Uh looks like Adam. So the other cool thing is you can go multiple humans in Charlie in one thread all across GitHub, Linear, and Slack. So, we got a plan. Um, Adam specified he wants uh the default system

### [20:00](https://www.youtube.com/watch?v=-l0OqapibAA&t=1200s) Segment 5 (20:00 - 25:00)

colors or the system theme to be the default colors. Um, and then I asked Charlie to create an issue and assign it to himself. Um, I don't know about you guys, but I have never met a developer that likes writing linear issues. Um, this is such a great thing to just brainstorm and then when you're done say clean it up, which we can look at. So got pretty standard issue here. He's actually put the right label on. Um and then also the call out from Adam in the thread for the default theme. And then again, so this happened uh earlier today and Charlie already has a PR for this that we can look at. Um, one of the nice things about Charlie's PRs is that uh because he's running in a VM, he can run all of your tests and types and all of that. Um, which means CI normally passes when he opens PRs. And then for added confidence, he'll also show you which commands uh he ran to verify the work before opening the PR. Let's check and make sure it works. Look at that. We got dark mode. And that's system because it's dark. We can go light back to dark. Um, he also does a quick review of his own PRs. This is a great way to catch more issues. As you're all aware, different context makes him better at catching things. Like we run review within the agent loop, but then starting it again without the context from GitHub is another layer of safety, which in this case, it looks like he actually caught uh our custom instructions say to use named exports. It's using a default export. Um so we can actually get them to fix that. But I'm also going to just as a reminder for you guys, this is like a toggle list here. We'll change that to a dropdown. So then you can review Charlie's PR's. Um, you can comment whatever you want in here, but we're just going to say, uh, address your own feedback and change to it. We go request changes. Um, and then now couple minutes from now, Charlie will have a PR that changes that default export to a named export and uh adds a drop down. So, let's go back and see how the bugs are doing. Look at that. We have PRs for a large number of these um fully autonomously. And we can take a quick look and make sure that our CI is passing. Yep. All green all the way across. So, that's an overview of what you can do with a fully autonomous parallelized agents. Um, it's important to think the alternative to this would be wrangling a bunch of local agents, uh, creating branches manually, PRs manually, and if you want to get fancy and parallelize it, you would have the pleasure of working with git work trees. Um, and now we've showed you how to write, review, and ship code. Um, sometimes things break. Next up is Danny Grant, the founder of jam. dev, to talk about fixing bugs in production. — Sounds great. We started Jam thinking, how do we make software a lot faster to develop and add more fun to it? Dare I say vibes. If you're not one of the hundreds of thousands of people using Jam for bug reports, you should try it. It's awesome. But today, I want to talk about what's next for fixing software. We started thinking, what if you don't have to file a bug report? What if when the designer, PM, founder sees something in production that they want to fix, what if they don't actually have to ask anyone to fix it? What if they could just fix it themselves? We thought it would be amazing to turn the browser into an editor so that when you see something that could be a little bit better, you can make it happen. You can just edit your site right there, right on the page, and let an AI handle taking your edits and turning them into a PR. We call it Okay, well, we named it after the two words you're never going to have to hear again. I'm so excited to show you please Fix.

### [25:00](https://www.youtube.com/watch?v=-l0OqapibAA&t=1500s) Segment 6 (25:00 - 30:00)

I'm going to demo Please Fix on Jam's own website so you can see just how easy it is to edit a live site. One of the most common edits people always want to make is copy. So with please fix, here's how you do it. While you're looking at your own site, when you see copy you want to change, you just click on the please fix extension. Then you select the copy you want to change and you just change it right there. And when you're happy with it, you hit submit and please fix creates the PR for you. You can also edit designs. So like let's make the subtitle a little nicer. Let's make it a little bolder. and a little smaller. And we can change the colors. All of these tokens in the editor are pulled from your codebase, your design system, so everything stays consistent and really clean. We have a powerful editor in line, but you don't even have to use it if you don't want to. You can just ask please fix to make changes for you, like make this QR code 30% bigger, and please fix is going to make that change for you. Like any editor, you can select multiple elements and edit them all together. So, let's select a bunch of images and we can edit them in the editor or we can also ask please fix to do something with them. Should we add a CSS animation? Yeah, animate these image sections so they scale like from 50% when they appear on scroll. This is totally the type of stuff that a designer really hopes an engineer will be able to get to before a big launch, right? The designer puts like 20 like kind of requests into the engineer. they get prioritized by the PM and the engineer gets to let's say five. This way actually you can move without bottlenecks on specific people. You can move at the speed of your entire creative team. The developer doesn't have disruptions. Rather they just get to focus on the hardest problems building new features while everyone gets to contribute what they can. Let's see the animation it made. I'm going to scroll down. You ready? Oh, it's nice. Hey, you know what else would be really cool? You know, sometimes you design something in your Figma and then you need to apply it to your live site. Well, what if we could just tell Please Fix to do that for us. So, we've actually designed an FAQ section for this site. And let's have Please fix it. I'm just going to take a screenshot of it. And I'm going to ask please fix to add this FAQ section attached like an email below this and we'll say what below which section because I want it right below the last section of the site. There's a whole lot of GPT5 codecs below the scenes here and a lot of codecs to build this. They are super powerful models. Thanks, OpenAI. We actually tried to build this exact product five years ago when we first started the company. And because there weren't LLMs yet, we couldn't. It wasn't production ready. And so that's why I'm so excited to be able to show you this today and for this to be something that you can really use. Cool. We've got an FAQ section. I think it looks great. Once you're done editing your site, you just review the changes and you can submit a PR. So, this takes a little while. So, I have one ready for us to look at. This is what a PR looks like by Please Fix. It's really short, sweet, easy to skim, see it in a preview branch. Uh, when you look at the code changes that pulls from your design system and reuses whatever you're using. So, in our case, it detects that we're using Tailwinds and it reuses our Tailwinds classes. I'm so excited to see what you all are going to build with this. Go to jam. dev/pleasfix to sign up and let's build a whole lot more software together. This has been fun. Thanks y'all. Awesome. Thanks, Danny. Our team works with incredible founders from all over the world, but not many are crazy enough to get up on stage and live demo at one of the biggest developer events. So, let's give them another round of applause. Awesome job you all. All of the founders will be outside if you would like to meet them, learn more about what they're building and how you can use it at your own startups or building your own applications. Thank you all for joining our first ever live demo showcase at Devday and hope you have a great rest of your day. Thank you.

---
*Источник: https://ekstraktznaniy.ru/video/11227*