# Full Tutorial: The Most Underrated AI Agent for Coding and Product Work | Eno Reyes (Factory)

## Metadata

- **Channel:** Peter Yang
- **YouTube:** https://www.youtube.com/watch?v=j7CaMx2c56M
- **Date:** 15.02.2026
- **Duration:** 35:32
- **Views:** 12,911
- **Source:** https://ekstraktznaniy.ru/video/9552

## Description

Eno is the co-founder of Factory and the most AI-native founder I know. He gave me a live demo of how he builds apps with AI agents, but what really blew my mind was his product management skill that writes PRDs, prioritizes features, and much more. We also had some real talk about competing in the white hot AI coding space.

Eno and I talked about:

(00:00) What makes Droid different from other AI agents
(03:46) Live demo: Building a speed reading app from meeting notes
(08:11) The difference between spec mode and plan mode
(12:04) How real engineers use AI agents vs vibe coders
(16:33) Skills vs MCPs vs hooks: When to use each one
(19:02) Eno's PM skill that completely blew my mind 
(22:46) Why Factory hires product engineers vs. PMs
(27:09) How a 40-person team competes with Cursor and Anthropic

Thanks to our sponsors:
Linear: The AI agent platform for modern teams. https://linear.app/behind-the-craft
Granola: The AI meeting notes app that saves you hours. https://granola.ai/peter


## Transcript

### What makes Droid different from other AI agents [0:00]

— This is probably one of my favorite skills that I have. What it does is, when I'm doing things like reviewing PRDs and product specs, working on design docs, or discussing feature prioritization, that PRD has our language. It has our ideas, our principles, and the structure of it ends up looking a lot more like the types of things that, if you'd been in the room at Factory for a year, you would say, instead of just what Opus 4.5 is randomly opining on.

— This is amazing, dude. Maybe you can share this with me privately or something. You only hire product engineers. Are you gonna hire, like, a regular PM at some point?

— I think what "regular PM" means has totally changed. We have an AE who's the number three Droid user at the company. He's in sales. We're about to publish some interesting work about how Droid has basically passed the threshold of what we call self-improving.

— All right, welcome everyone. My guest today is Eno, co-founder of Factory, which makes a popular AI coding agent called Droid that works with any terminal or IDE. Today he'll show us how great engineers actually work with AI agents and give us a live demo, and we'll also talk about the crazy competitive AI coding space and what actually has product-market fit in that space. So welcome, sir.

— Hey, thanks so much for having me. I'm really excited to be here, and there's nothing I love chatting about more than software development agents.

— All right, cool. So maybe before getting into the demo, can you tell us about Factory and Droid and what makes it different from other AI coding tools?

— Yeah, totally. Factory, we've been around for actually a surprising length of time. We're about two and a half years old. And when we first started, the world was not really comfortable with the concept of letting an agent just run, call it YOLO mode or whatever, on your computer.
And so we said we really need to orient around building products that will work in enterprise environments, products that make VPs of engineering at a 10,000-person organization feel comfortable. So we set out on a long journey to build these AI systems. We call the core agent Droid. Our product doesn't stop at the terminal or the IDE or the web or desktop. Of course we have those surfaces. But we also provide tooling that helps you analyze your entire company's codebases to determine what's stopping agents from being successful. We give you ROI analytics. We give you enterprise controls. So there are a lot of layers that make us feel like the full enterprise solution for software development agents.

— Got it. It's smart that you guys focused on enterprise from day one, because I question the product-market fit on the consumer side. So, smart.

— Yeah, totally. I think there's a lot of optionality that people have. There's a million different coding agents, and some of them are hackable, some not. I think the people who care about quality have found their way to us. But if you're cost optimizing or something else, you may just go with a subsidized plan or an open tool.

— Awesome. Let's build something live. What do you think we should build?

— I love this idea. I was thinking that maybe I would start sharing my screen, and maybe we could use Granola and actually record some of our back and forth on a prototype of some form, a web app. What do you think? Is there a specific direction you wanted to take this?

— Yeah, we can just build a simple web app, and maybe you can use some best practices of using Droid. You can show us how things work.

— Yeah, totally. Then what I'd suggest is, I'm going to pop this open, hide this, share my screen, and show this Granola transcript that I have right here.
What we can do is build out an app that showcases a simple fast-reading application. I don't know if you've seen this viral thing on Twitter where you have a book, or you upload a bunch of documents, and then it lets you read really quickly, speed read basically. Sound interesting?

### Live demo: Building a speed reading app from meeting notes [3:46]

— Yeah, that sounds good.

— Cool. So, typically what I recommend to folks is, when you want to use Droid, you open up either the terminal or our IDE extension. Here I'm just going to use the terminal, and I'm using Ghostty. I love how quick it is. And basically, we have a very simple interface. Not very simple, but a fairly simple interface that lets you type in. If you've ever used a terminal-based agent, you'll have all the bells and whistles. We support things like skills, MCP, hooks, etc. You can select your model. And I think one of the cooler parts about Factory is that we support nearly every frontier model, as well as different levels of what we call autonomy, which is basically how much ability you want to give the agent to operate in its environment. Do you want to approve every action it takes? Do you want it to run only read-only commands on its own? Reversible ones? Or everything? I'm going to turn it on high autonomy here and I'm just going to paste this transcript. This is the transcript from Granola of the conversation we just had, where I suggested we do this. This is something we do all the time: "Let's build a prototype for this in this directory, please." And I'm just going to paste it. A couple of things that I think are interesting about Droid. You're going to see it plan, read, list directories. We make it really simple for you to see what it is actually doing at a high level as it works. But when you actually go under the hood, I think where Droid shines the most is on long-running tasks, when you want it to run for not just a minute or 10 minutes, but really an hour. It's too hard to show in a quick podcast. But we've done a lot around things like compaction or compression and prompt caching to make the experience feel really nice.
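The autonomy levels described above (approve everything, auto-run read-only commands, auto-run reversible ones, or run everything) can be sketched as a small approval policy. This is only an illustration: the `Risk` and `Autonomy` names and their ordering are assumptions made for the example, not Factory's actual implementation or API.

```python
from enum import IntEnum


class Risk(IntEnum):
    """How risky a proposed action is (hypothetical classification)."""
    READ_ONLY = 1      # e.g. listing a directory or reading a file
    REVERSIBLE = 2     # e.g. editing a file tracked in version control
    IRREVERSIBLE = 3   # e.g. deleting data or pushing to a remote


class Autonomy(IntEnum):
    """Highest risk level the agent may run without asking (hypothetical)."""
    OFF = 0     # the user approves every action
    LOW = 1     # read-only commands run automatically
    MEDIUM = 2  # reversible commands also run automatically
    HIGH = 3    # everything runs automatically


def needs_approval(action_risk: Risk, autonomy: Autonomy) -> bool:
    """Return True when the user must approve the action before it runs."""
    return action_risk > autonomy
```

With this ordering, a middle setting lets the agent read and edit freely while still pausing before anything it cannot undo, which is the space between "approve everything" and "YOLO mode" that the conversation goes on to discuss.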
— And dude, I just want to mention one thing. The fact that I can just, I think I use Tab or something, pick "allow all commands" versus "allow some commands," that's much better UI. I love Claude, but the default experience in Claude Code, where it asks you for permission for everything, it sucks, man. I don't like sitting around trying to grant it permissions, you know.

— Totally. I think there's actually a real security and risk thing if you give people only two options: approve everything manually, or dangerously run YOLO mode. Here you can see Droid is actually opening the browser for me autonomously. It's jumped in and it's basically testing things out. You can see it's taking screenshots and QAing the work that it just did. So it's going to determine, you know, "Did I adequately test what the user is doing?" You can see here...

— Is that using Playwright, or is it just some native thing that you built?

— This is using Chrome DevTools. But by the time this podcast airs we'll actually have made this native. So the Droid for everybody will be able to browse and interact. And see, it's basically confirmed that it's done. It gave me a little alert and I can iterate. So, I think most people who've used a tool like this are familiar with this workflow, but once you actually jump in, a lot of the nice quality-of-life things, like the ability to create skills, manage your skills in one place, and an MCP registry that contains most of the major tools you'll use, like Linear, Notion, etc., one click away, really just make for a much nicer experience when you're developing. So if you want a strong multi-model harness, I think Droid is basically the leading option there.

— This episode is brought to you by Linear. When engineers use tools like Cursor, Claude Code, and Codex, a lot of work happens invisibly.
Someone can go from a bug report in Slack to a shipped fix without creating any record of what happened outside of the code editor. And that's fine for speed, but it makes coordination harder as you scale. Linear integrates directly with the very best agent coding tools, like Cursor and Codex. That way anyone can see what an agent is working on and who assigned them to the task. You get the speed of agents without losing visibility across the team. Product teams at OpenAI, Ramp, and Block are all using Linear to collaborate with AI agents. And I use Linear myself to run my creator business. So check it out at linear.app/agents. That's linear.app/agents. Now, back to our episode.

— It's typical best practice to write a little plan or spec first before you do this thing. But in this case, I guess our spec is just a Granola conversation.

### The difference between spec mode and plan mode [8:11]

— Exactly. But what I recommend is, we actually have something called spec mode, and you can switch to it by just hitting Shift+Tab. In spec mode, I'm gonna say, "Let's make this a more fully fleshed out product." A lot of agents call this planning mode, where you get a plan for what to do. Our view is that a plan is a little different from a spec. A spec is what should be built, and a plan is how you build it. We think the agent should figure out how. You shouldn't be in plan mode; you should be in spec mode, where you define what should be built. Here it's asking me questions, like what input sources it should be able to use. I'm going to say all of the above. What reading enhancement features would you like? Maybe chunk mode. And local storage. Any additional features? I could type my own answer here and say, "Let's definitely have a party mode button." So, I've answered all of its questions, and it's going to propose a specification. When it proposes the spec to me, I have a bunch of different options. I can choose to edit it. I can open this up. You'll see that this is saved as an actual document. So if I choose to manually edit, it'll actually open VS Code for me so that I can jump in here, look through this spec, read through it, and edit it. And after I've edited, I'm just going to delete party mode. Let's go ahead. You'll see that Droid will pull that spec in, reread the changes I've made, and kick off a plan to go further.

— Got it. And this is after it's already built the initial version, right? So it kind of gets better.

— Yeah, exactly. So, we basically just specced out a whole new plan of how to work.

— Got it. Yeah, this is awesome.

— So, as it's iterating, you're going to see it's changing stuff. So, obviously, it's not going to work yet.
React's hot reload is obviously awesome because it's going to keep hot reloading. But the moment that it completes its work... you can see it's asking for permission as it operates. I'm actually going to shift it to high autonomy so it stops asking me for permission. And I'm going to just let it cook.

— Oh, so you can actually shift it while it's working.

— Yeah, I can shift in and out of spec mode. I can shift the autonomy levels. I can actually change the model mid-session. So if I want to start and plan in, for example, Opus but then execute with GPT-5.2, these are all settings that you can turn on. Or if you just want to switch mid-session, you can.

— Got it. I do think being able to pick the model is important. I mean, I kind of get comped access to a lot of this stuff, so I don't think about cost. But if you're running an enterprise, the cost really matters, right? Because Opus is pretty expensive. You don't want to run it for everything.

— Yep, totally. And I think there are also a lot of things that people are discovering now. For example, GPT-5.2 Codex is extremely diligent. It's very good at validating its own work, and it will run for a long period of time, but it doesn't have the same sort of high-level planning intelligence that, fairly subjectively, although we have some evals to back this up, Opus 4.5 has. And so there's a great way to get the best of both worlds in model-agnostic harnesses, because you can actually say, look, Opus will plan and GPT-5.2 will execute. And that combo actually outperforms either alone. So a lot of what we try to do is make decisions like these way easier for you by setting sensible defaults and giving you a really solid experience. And of course the cost thing matters a lot for people.
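The "one model plans, another executes" idea can be sketched as a tiny routing table. This is a hypothetical illustration only: the `ModelRouting` class, the phase names, and the model identifier strings are placeholders invented for the example, not Droid settings or real API model IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRouting:
    """Which model handles which phase of a session (hypothetical)."""
    planner: str   # model used while drafting the spec or plan
    executor: str  # model used while carrying the work out

    def model_for(self, phase: str) -> str:
        """Pick the model for a phase; only 'plan' and 'execute' are known."""
        if phase == "plan":
            return self.planner
        if phase == "execute":
            return self.executor
        raise ValueError(f"unknown phase: {phase!r}")


# Placeholder identifiers standing in for the models named in the conversation.
routing = ModelRouting(planner="opus-4.5", executor="gpt-5.2")
```

Keeping the mapping in one place is also what makes the cost trade-off easy to act on: swapping the executor for a cheaper model is a one-line change.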
So being able to switch to a cheaper model or a more expensive model tends to be a pretty pleasant experience.

### How real engineers use AI agents vs vibe coders [12:04]

— So do you have any high-level tips? Like, how would a real engineer use this versus, you know, a vibe coder?

— Totally. I think probably one of the things that's most optimized for real engineering scenarios is that Droid has a lot of both system injections and prompting, as well as harness-level modifications, to really heavily encourage validation of its work. We use this word "validation" a lot, but our view is that agents are fundamentally bottlenecked by the ability to validate their own work. Chrome DevTools is a great example of QAing and validating that the change it made actually, visibly, makes sense. Code has tons of these validators: you have linters, unit tests, type checkers. I don't know if you can see that it's continuously building, running dev, linting, and type checking in this flow right here as the Droid is working. We think we've basically done this to a higher degree than most, which is a big benefit for the actual product experience that people have. So here it's going to open this up. You can see it taking control. We've added some of the things that we mentioned, the ability to add content, full screen, etc.

— Yeah. So, I don't have to remember to do all this testing manually. It just does it for me each time I ask it to do something, like build something new.

— Exactly. The Droid will actually take screenshots of your product. It will QA it for you. It'll click through. It'll list console messages, like, are there any errors that popped up in the console? This is a lot of stuff that we think, you know, as somebody who is in product or data science, or even just someone who's not a front-end or full-stack engineer, if you're building prototypes or you're building straight-up end-to-end real work as a production engineer, obviously you can know these things, and everyone knows it's good to do them.
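The validation habit described here (build, lint, type-check, and test before moving on) can be sketched as a simple gate that runs each validator command and refuses to advance if any of them fails. This is a minimal sketch under stated assumptions: `run_validators` and `may_advance` are hypothetical helpers, and the example commands are stand-ins that launch the Python interpreter rather than a real linter or test runner.

```python
import subprocess
import sys
from typing import Sequence


def run_validators(validators: Sequence[Sequence[str]]) -> list[str]:
    """Run each validator command and describe every one that failed."""
    failures = []
    for cmd in validators:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(f"{' '.join(cmd)} exited with {result.returncode}")
    return failures


def may_advance(validators: Sequence[Sequence[str]]) -> bool:
    """The agent moves to its next step only when every validator passes."""
    return not run_validators(validators)


# Stand-in commands; a real project would list its linter, type checker,
# and test runner here instead.
EXAMPLE_CHECKS = [
    [sys.executable, "-c", "pass"],                 # pretend linter: passes
    [sys.executable, "-c", "raise SystemExit(1)"],  # pretend test run: fails
]
```

Calling `may_advance(EXAMPLE_CHECKS)` returns `False` because the second stand-in exits non-zero, so the work would go back for another iteration instead of being reported as done.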
But when your agent is the one that says, "No, I actually need to validate my work to move to the next step," the quality of the output is way higher. So I think when a lot of people say Droid subjectively feels really good, what they're actually pointing towards is this idea that we validate the work very rigorously. And it doesn't really come at that much of a cost in spend or tokens, because it's the measure twice, cut once thing. A lot of agents are measuring once, cutting once, measuring again, cutting again. For us, it's just: validate the work iteratively and you'll get a much higher-quality result.

— Got it, dude. Let's check out the app, man. So what does this thing actually do?

— Yeah. So this is a speed reading app that basically lets you go through text, and I think the idea is that it helps you maintain comprehension as it works. I've noticed that it's doing two-word chunks. So what I actually want to do is see if I can change it to one-word chunks. And so the idea is you can read this as it goes.

— Got it. Got it. So it reads much faster than having a huge paragraph.

— Yeah, exactly. So it's just a sort of play app that you'd have. But I think the thing that's fun about this when you full screen is, I don't know if you can tell, but this is actually already stylized. We have a public website where we've got a lot of content. One thing that I like is that Droid is really good at picking up your codebase's existing styling. So these are our brand colors. These are components similar to our actual design system. The modules have our borders, the font is ours. And I think what a lot of people underestimate is that building stuff that's in your design system, and doing it well, is actually fairly difficult.
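The chunking behavior mentioned in the demo (two-word chunks versus one-word chunks) can be sketched in a few lines. This is an illustrative Python sketch, not the code Droid generated in the demo; `chunk_words` and `seconds_per_chunk` are hypothetical names, and the words-per-minute pacing formula is an assumption about how such an app would time its display.

```python
def chunk_words(text: str, words_per_chunk: int = 1) -> list[str]:
    """Split text into consecutive chunks of N words for one-at-a-time display."""
    if words_per_chunk < 1:
        raise ValueError("words_per_chunk must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def seconds_per_chunk(words_per_minute: int, words_per_chunk: int = 1) -> float:
    """How long each chunk stays on screen at a given reading speed."""
    return 60.0 * words_per_chunk / words_per_minute
```

Switching from two-word to one-word chunks is then just a change to `words_per_chunk`; at the same words-per-minute setting, each chunk is shown for half as long.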
And so if you want to have vibe-coded things that just go zero to one on a random codebase, that's fine. Droid is fairly good at that. But when you have an existing codebase, like our Factory public web here, and you want to make modifications to it, you want to build a new app, you want to keep the consistency of your design system, Droid can do that quite well.

— And you didn't build, like, a design system skill or something. It just does it by reading the code.

— No. If you look back, there's no skill being invoked. It totally could, though. If you wanted to have a skill for your design system, you could. But I think that's actually what's cool about Droid: at the beginning it does this grounding step, right? It's actually reading through. It's looking at different layouts. It's looking at our CSS. It's looking at different pages, and it's using that to ground its UI.

### Skills vs MCPs vs hooks: When to use each one [16:33]

— All right, dude. Well, let me ask you this. I'm going to throw you a curveball. So, there are all kinds of crazy terms, right? There's skills, there's hooks, there's subagents. For someone who's new, this is super confusing, man. When do you actually use all the other stuff? Or can you just go back and forth with AI and build something?

— Yeah, I think this is such a hotly contested debate. We have full support for all of them, right? So subagents, skills, MCP, hooks, slash commands, and a global config that lets you manage all this stuff. What we've seen is that clearly skills and MCP have by far the highest usage. And I think this answer changes based on who you are. If you're a solo developer, I think there's a lot of opportunity for you to build your own custom workflow with these things. My personal opinion is that we get a lot of mileage by just having a couple of skills that matter for things like data engineering, for things like building repeatable components and integrations. And I have a skill, and a lot of the people on our team have a skill, for writing in language that matches their voice when they want to use it to generate content. In terms of MCP, there are a ton of them. And obviously we have a registry for things like Linear, Notion, Axiom, Datadog, Sentry, etc. My view is that skills might just be a better way to manage integrations and context. So if you can get a skill for a given capability, that might be better than MCP. And hooks, I think, are really good if you are the type of person who loves to make their tool super custom. But with enterprises, what we've seen is that they will have a couple of people focus on making skills, MCPs, and tools for their whole organization or for big teams in their org.
And because Factory is the only offering that lets you, from an enterprise perspective, manage who has what customizations at the user, team, and enterprise level, I think a lot of power users end up getting converted over to Factory, because it's just easy to get everyone in your 10,000-person company outfitted with a skill that meaningfully changes their dev productivity on a daily basis.

— Okay. So, there's like a permission system or something.

— Yeah. Permissions, and also just shared access to a ton of different skills, tools, and MCP at the enterprise level.

— And you can tell me no on this, but can you actually show us a skill? Can you show me your writing skill, or whatever skill you want to show?

— Yeah, totally.
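The user, team, and enterprise layering of customizations mentioned above could be modeled roughly as follows. This is a hypothetical sketch for illustration: the `Scope` class, the `effective_skills` function, the `blocked` parameter, and the skill names are all assumptions, and nothing here reflects how Factory actually stores or resolves permissions.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Skills granted at one level of the organization (hypothetical model)."""
    name: str
    skills: set[str] = field(default_factory=set)


def effective_skills(enterprise: Scope, team: Scope, user: Scope,
                     blocked: frozenset[str] = frozenset()) -> set[str]:
    """Union the three levels, then drop anything an admin has blocked."""
    granted = enterprise.skills | team.skills | user.skills
    return granted - blocked


# Hypothetical example data.
enterprise = Scope("enterprise", {"code-review"})
team = Scope("platform-team", {"data-engineering"})
user = Scope("individual", {"writing"})
```

In this model a skill published once at the enterprise level reaches every user automatically, which is the "outfit everyone in a 10,000-person company" effect described in the conversation.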

### Eno's PM skill that completely blew my mind [19:02]

— I have a couple here that are live on my prod: changelog, code canvas, product management, writing, Factory blog posts. So if I were to go and actually...

— No, let's look at the product management one, because there's a bunch of PMs.

— Yeah, of course. "Can you open my product management skill file in VS Code?" I could probably do that myself, but I use Droid for everything. So it's much easier to just say to Droid, "Open that file, please." And so there we go. I actually think this is probably one of my favorite skills that I have. What it does is help when I'm doing things like reviewing PRDs and product specs, working on design docs, or discussing feature prioritization. I'll zoom in so it's easier to read. What I've done is, we have a bunch of source-of-truth documents. We have our product principles. We have a core value prop, what we call the 11-star experience, which is taken from Airbnb. This is an awesome framework. Basically, Brian Chesky was like, you know, a five-star Airbnb experience, they roll out the red carpet, it's great, you get the Airbnb, they give you the keys, they give you a bunch of cool things to do.

— Yeah.

— That's the five-star experience. What's six-star? What's eight? What's 11? Right. And 11 is like Elon Musk personally takes you on the rocket ship yacht and you go to Mars. So what this framework does is let you say where we are today, and what the baseline expectation of an amazing experience in your product is. That is the bar. Now what comes after that? What comes when you break that bar? And what's cool about Factory is that over the last two and a half years, our original 11-star experience, or at least the seven-star that we had two years ago, has slowly become our five-star experience.
So it's just the baseline expectation. What wasn't even possible, the "maybe at some point in the future this will work," is now what we expect the average user to have in our product. So it's a really cool framework. Anyway, tons of docs: product positioning, how we build, prioritization frameworks, templates. And what you do is basically pull all these Notion docs together. Factory has a native Notion integration, so you don't need the MCP. You just integrate it for your whole company and it handles permissions. So it'll pull all that data, and then it'll use that for things like PRD reviews and guiding the language, and it has a couple of examples.

— But, bro, I guess it calls different ones, because this is probably a lot of Notion docs, right? So does it call different Notion docs based on what you want to do, like build a PRD?

— Yeah, exactly. You can think of it as almost a map of our most important documents, and these are shared sources of truth, right? If our company was purely people in GitHub, I would probably put these in Markdown in GitHub, but we have folks that use Notion, like our AEs and our ops. Our product team is actually all product engineers, so they're all engineers. But we pull all this stuff together, and then what happens depends on what you're working on. So if I say, "I would like to write a PRD about this new thing," that PRD has our language. It has our ideas, our principles. And the structure of it ends up looking a lot more like the types of things that, if you'd been in the room at Factory for a year, you would say, instead of just what Opus 4.5 is randomly opining on.

— This is amazing, dude. Maybe you can share this with me privately or something. I can copy this thing so I can make my own.

— Oh, yeah, for sure. I'd be happy to.
We can maybe attach it to the video and share it with anyone who's listening.

— Yeah, that would be amazing. I've always wanted to build a product management skill.

### Why Factory hires product engineers vs. PMs [22:46]

— And you mentioned one thing that's a little bit innocuous, but I think has a big impact. You mentioned that you only hire product engineers. So, are you going to hire a regular PM at some point, or do you want people with both engineering and PM skills?

— Yeah. Well, I think it's funny, because I think what "regular PM" means has totally changed. And it's compounded by the fact that what we build is a software development agent. So even our AEs... we have an AE who's the number three Droid user at the company. He's in sales, right? But he is still the number three user of Droid. He does everything from Droid. He does customer research. He puts together skills for analyzing customer usage data to determine how he can provide better experiences for his customers. He uses it to track his deal flow. He has Salesforce connectors. So everything in his life is operated by Droid. I think a lot of people underestimate this aspect of software development agents, maybe because of the terminal-dominant UI. But our view at Factory is that software development agents are basically the next generation of general AI systems. It's no secret that software development agents are advancing basically everybody's capabilities, not just software engineers'. Ask anyone at Cursor, Anthropic, or OpenAI; they'll all admit that most people at the company are using their software development agent for productivity gains. And so it's quite clear that, for us, what it means to be in really any role has changed a lot. If you have no experience as a software engineer, you can still be in product at Factory. That's totally fine. But I think you definitely need to drive most of your workflows with AI if you want to work in any role at Factory.

— Dude, and your AE probably doesn't know how to read code syntax and stuff like that, right?

— He's a bit of an outlier, so he definitely does.
However, most of our AEs are definitely not in VS Code. They're not trying to live and operate their life via the terminal, but they still use Droid.

— Because I think it's just a higher level of abstraction. It's almost like engineering: you need to understand some of the technical stuff, but you're basically trying to talk and plan this stuff out in English, right? It's not like you've got to know about for loops and while loops and all this kind of crap anymore, you know?

— Yeah, 100%. It's funny, because we were just talking about this internally. I think people think of the terminal as this destination or place because they're used to thinking of the IDE as this destination. What I mean by that is, the IDE contains this encompassing view of all the information that's helpful when you're coding, but that also builds walls around the IDE as a concept, right? It may be easier for a software developer to operate inside those walls, but it really sucks you in. You open the IDE full screen, it's got all these crazy screens and debuggers, it's got 50 buttons. And this complexity is definitely intentional, because it's a power tool, but it changes how you interact with it. Our view is that the terminal, or a native app for agents, is not necessarily a destination in and of itself. It's not your full screen. It's more of an overlay. It's this thing that lives on top of the rest of your computer. Something that you keep open all the time, something that has access to the file system, the apps, the desktop. I think this is a better indicator of where the future is going. These software development agents are just general computer use agents.
And so most people who work on computers could benefit from having a little overlay in the upper left-hand corner of their computer that they can talk to and basically ask to do nearly any task for them, and it should just work.

— Yeah. I mean, all white-collar work is done on computers, and code is how computers work, so it kind of makes sense.

— Yeah. Software is sort of the physics of AI agents. So it definitely behooves them to be good at manipulating their own physics, their own world. And I think that's also why software development agents have moved way faster than other fields: they're also made of software. So the self-learning bootstrapping is very clear. We're about to publish some interesting work about how Droid has basically passed the threshold of what we call self-improving.

### How a 40-person team competes with Cursor and Anthropic [27:09]

— Wow. So, let's talk about something, which is, you're a pretty small team, right? How many people are in the company?

— We're 40.

— Okay. And you're competing against, you know, Claude and Cursor and these super well-funded companies. Dude, I'm super impressed that you guys are number one on Terminal-Bench with a much smaller team. So how do you do it, man? Any secrets?

— Totally. I mean, I think there is a funny thing where all the resources in the world, as we all know, cannot necessarily purchase a product experience that's fully crafted for your ICP. I think there are two angles here that are important. The first is that the Cursor team, the Anthropic team, the OpenAI team are incredible. We work with them all the time. They're all awesome. Every time I've met these folks, they're total class. So you have to hold two things in your head. There is a huge, well-funded, very smart group of people also building in this space. But at the same time, I think there's just so much to be explored in AI for software development. Just open Twitter and read a couple of people's workflows, and you'll quickly realize the variance in what a good AI software development agent or a good workflow looks like is so high that there's just so much to build. And so for us, there are two things that really matter. The first is just a relentless focus on customer and ICP. There are features in Droid that make no sense for a solo developer. Things like the enterprise hierarchical controls, some of how OTel works. You can actually run Droid in the most air-gapped environment; you could run it in a submarine if you wanted to, as long as you had a GPU. I think that level of control, flexibility, and customization doesn't really sell well to an individual developer.
However, I think this is what has made us more capable in general: because we've built all these things, we have gotten access to customers that have incredibly difficult and very sophisticated software problems to solve. So we get to basically hill climb not only on public benchmarks, and in fact we actually don't really hill climb on public benchmarks. Like, our performance on Terminal-Bench is not because of Terminal-Bench. It's because of a separate dataset built from more realistic enterprise customer data. Um, and so that, I think, has been a huge boon for us: basically being able to work on much harder software problems. And if you solve those, a lot of the, they're not necessarily simpler, but maybe more straightforward problems, like full-stack development, you know, zero to one, etc., sort of come naturally. — Yeah. Like, the harder software problems are, like, what, refactoring and, you know, these gnarly legacy codebases, right? It's like all the stuff that engineers don't want to do. Is that what — Exactly. Like, I think that there is just so much crap involved in software, and no one wants to be the guy or the girl refactoring a COBOL legacy codebase that's, like, 15 years old. Everyone who's touched it is either gone or, you know, not working on it anymore. And droids just do that stuff pretty well. — Yeah, dude. Because I think the best part about, you know, Droid and some of these other AI agents is that it's very detail-oriented in just reading and understanding the codebase. Because if you onboard a new human developer, it's going to take them a long time, man, to figure out what the hell's going on with the codebase, especially if it's a mess, you know. — Yeah, and I think that that's one of the coolest things we've seen. There's a customer where we went from zero to 10,000 people in, like, a couple of months, basically. 
Uh, and one of the ways that we did this was just by enabling not just software engineers but really everybody who wanted access, and saying to them, look, if you are someone who is anywhere near the software process, open this tool up and just start asking it some questions that you've been wondering about the world that you operate in. Like, there are so many people for whom it's very costly time-wise to learn coding or learn big aspects of software, but their work is so consequential to the delivery of software: ops, product, QA, DevOps, you know, data science. They sort of know how some of the software engineering stuff works, or maybe they know it pretty well. They just haven't invested time in learning. Droids just make this so much easier. So it does really feel like democratizing access to what used to be a very complex and hard-to-understand topic. Uh, yeah, — you can now just get it digested for pretty cheap. Yeah, I think enterprises should just give everyone access to the codebase. Like, maybe not write access, but at least read access to just figure out what the hell's going on, because then you can ask a bunch of questions to Droid and these other agents instead of bothering other people. — Yeah, I mean, a bunch of companies are going to pay a huge cost for design decisions like "we're not a monorepo" or "we're limiting codebase access to only these personas." Like, this stuff is not going to scale well. It won't age well into the AI era. Uh, so it's a lot to think about if you're an engineering leader — and probably, because you're focused on enterprise, which I totally think is the right thing to do, probably a lot of what you're doing is just educating, because, you know, if you're trying to sell Droid to Anthropic or something, like, that makes sense. 
They don't understand what's going on, but, like, a lot of these enterprises, like, you know, Accenture or whatever, they don't know any of this, right? You have to train them on how the stuff actually works. — Yeah. And that's a big part of it. Also, when you're building a product for enterprise, I think you have to be thinking about not just "does the product work and is the user journey very clear," but also "how does a user become a power user?" Like, what is the activation, and then what is basically the secondary activation that happens post "I'm using the tool and now I'm really..." For us, we've seen there's a usage-based activation of, like, "I'm sending messages pretty frequently, so I clearly like the product." And then there's a customization activation, which is "I've uncovered what skills and hooks and MCP and tools are." Those power users in the enterprise very quickly become evangelists. They're sharing it with everybody. They're so excited to use Droid. They start bringing more and more of their work into Droid. Uh, and I think the most fun is seeing enterprises that most would say are, like, quote-unquote legacy — doing cooler stuff than what you see on Twitter, which is fun. — Yeah. It's kind of like the people using Claude Code to run their life, except for enterprise, right? — Yeah. Exactly. And the enterprise, at the very least, is actually better suited for this sort of stuff. Like, it's still a huge pain to connect your Gmail to Claude Code or to Droid. But you can actually pretty easily connect Outlook and Excel and all this other stuff to Droid. So, yeah, it's a lot easier to operate your, like, work OS from Droid. — Awesome, dude. So people are excited about Droid and, you know, building a product management skill. So Droid is free to use, right? Like, where do people go? — Yeah, just go to factory.ai. We've got the CLI link right there. 
Uh, but if you sign up, we can give you a bunch of free usage to get started. There are some really exciting things, depending on when this airs, around free usage that I think a lot of people are going to be excited about. Um, and then we have a bunch of plans for all sorts of options. So it's really easy to get started. Just one line and you're in. — All right, dude. Well, I mean, I have thoughts about a bunch of vibe coding companies out there, but I think with you guys it all comes down to focus, man. And I'm super impressed by the progress that you've made. So yeah, I definitely highly encourage everyone to give Droid a try, and also you have some really great talks out there. So should people find you on Twitter, or where can people find you? — Yeah, it's Eno Reyes. Uh, and you can, you know, see all my Twitter escapades. — We gotta get a speed-read thing going with the Twitter API so you can just read all the rage bait tweets. — Exactly. That's a great call: just a daily dose of very fast rage. — Yeah. And then you just become a very demented person. But anyway, yeah. Cool, man. All right, dude. Stay in touch, man. — Yeah. Thanks so much. Bye.
