# Beyond the Hype: 28x Your Engineering Velocity with AI

## Metadata

- **Channel:** InfoQ
- **YouTube:** https://www.youtube.com/watch?v=uhYwsbcOaBQ
- **Date:** 11.05.2026
- **Duration:** 40:55
- **Views:** 1,912
- **Source:** https://ekstraktznaniy.ru/video/51772

## Description

The "10x Developer" is dead - long live the 28x Engineering Leader. 

While many senior engineers are skeptical of AI hype, the reality isn't just about writing code faster; it's about fundamentally re-evaluating the software lifecycle.

In this InfoQ video, Sepehr Khosravi (Coinbase) explores why AI sentiment is dropping despite tools being at an all-time high, and how top-tier organizations are moving past "simple autocompletion" to "agentic workflows." He breaks down the exact tech stack (Cursor vs. Claude Code), the power of Model Context Protocol (MCP), and the mindset shift required to achieve 28x productivity gains - inspired by real-world lessons from the Databricks CEO.

⏱️ Video Timestamps (For Navigation)
0:00 – Is AI Productivity Overhyped? (The 2025 Survey) 
2:15 – The Reality of Net Productivity Gains (15-20%) 
4:10 – IDE Layer vs. CLI Tools: Which to Choose? 
6:30 – 14 Pro Tips for Cursor & Composer 
11:15 – Claude Code: The King of Complex Feature Research 
14:40 – Mast

## Transcript

### Is AI Productivity Overhyped? (The 2025 Survey) [0:00]

Thank you. Thank you, Jasmine, for the intro. Before we go into maximizing developer productivity using AI today, I think it would make sense for us to get a gauge of where we are today with AI. So, if everybody could get out their phones and scan this QR code, it's going to be a little survey we're going to take to see how far into AI productivity we are as a group right here today. It might ask you for your name. Feel free to skip that part. It's going to be anonymous regardless. And then after you answer the first question, make sure you do not close out of the screen. We have three total questions that we're going to ask, and it's all going to be on that same link. All right. Going once, going twice, moving on. I'll leave Jas in a sec, too. Okay, there we go.

Okay, first question. What level of AI-assisted coding best describes you? And we're going to see a live update coming in over here. You can say none if you use little to no AI; beginner if you occasionally use some ChatGPT or Claude; intermediate if you're regularly using AI as your copilot; advanced if AI is your default and you're mainly reviewing and iterating on the code it produces; and expert if you're building full-on AI workflows, agents, and tool integrations. Awesome. I see the responses rolling in. It seems that most of us here are intermediate, around 60%, which is pretty good. Only 2% are at none, which is awesome to hear.

Moving forward to the next question. What percentage of your daily coding is generated by AI, would you estimate, today? I'm seeing 25%, 50%, 75%. Okay, so this one is lower than I expected from the other one, but around 50% of you seem to be having zero to 25% of your code generated with AI.

And then last question, this one's pretty open-ended. What developer productivity tools do you use the most? You can have up to five responses here. Just go ahead and put whatever tools you're using the most and we'll see like

### The Reality of Net Productivity Gains (15-20%) [2:15]

a word bank start to pop up here. Seeing a lot of Copilot. The bigger the word gets, the more people wrote that one in. I'm seeing Copilot, a lot of ChatGPT, Claude, Cursor, GitHub, Cider, Astra. Somebody just said code. Okay, that works. I'll give it a couple more. Not as wide of a variety as I would have thought. Seems like most of us are on Copilot, and it makes sense.

So I want to compare where we are in the room today to the latest survey. Stack Overflow did a survey on AI tools in the development process. And what they found out is that roughly one in three engineers use AI less than once a month, which is a lot higher than I expected. That's one in every three people that you might see here who aren't using AI to code at all, but I think from what we saw, we're a little bit over that. Also, there might be some bias because it is a Stack Overflow survey, and people who are using Stack Overflow may be using AI a little bit less, but this is the best data that we had out there.

Now, what's also really interesting is that although AI usage has continuously been going up over the past 3 years, sentiment has actually decreased over this past year, 2025. In 2023 and 2024, sentiment was above 70%. And this year, in 2025, we've come down to just 60%. Which is interesting because, as we know, the AI tools are the best they've ever been in 2025, right? So, why is that? I think a lot of it is due to headlines like this, and a lot of CEOs coming out and making really bold claims about AI. Zuckerberg went on the Joe Rogan podcast and talked about how AI will replace mid-level engineers soon, maybe by the end of 2025. And then because of that, I think naturally there's this hype that gets created on

### IDE Layer vs. CLI Tools: Which to Choose? [4:10]

one end, and then there's a counter-reaction to that where people start to say, no, this isn't the case, AI coding is a little bit overhyped, it's not all it was planned to be. So the pendulum swings this way, and I think that's where we're kind of at right now, with a lot of people being a little bit apprehensive about using AI to code. But the reality, like most things, usually lies somewhere in the middle.

So for the agenda for today, we're going to talk a little bit about the current state of developer productivity: what realistic gains can you expect to make? Then we'll talk a little bit about choosing your AI copilot. And then finally we'll go into two of my recommended tools, which are Cursor and Claude, and then I'll share some lessons that I learned from the Databricks CEO last week, followed by a Q&A at the end. All right, let's get into it.

Developer productivity. First off, this is another long-term piece of research that was done on over 100,000 employees by Stanford, and they wanted to see how much productivity gain is actually being made in the code that is written. I won't get into the exact methodology that they had, but they didn't just measure commit numbers or lines of code changed. They had a panel review the code and try to actually understand what level of productivity was gained from that code, not just the number or the quantity. And what they found is about 30 to 40% more code being generated by using AI. However, they also realized that 15 to 25% of that was code that ends up getting reworked, whether it has bugs or ends up getting deleted later on. So they estimated that the net overall software engineer productivity gain from AI is about 15 to 20%. Now, I think that number can be even higher if you learn how to use these tools, but I think at a bare minimum that is something that you can expect to gain from these tools. Now, going on to look into the tools, I think there's really three categories of tools.
The first category, I would say, is the all-in-one, non-developer-friendly tools that anybody can use, and I think with this category we really do have 100x in productivity. This is where I spend a lot of my time teaching at UC Berkeley. I also run a nonprofit for kids to teach them how to use AI tools. We see people who have zero experience come into these courses, build a business, and start making thousands of dollars off software they wrote, right? Is it the most insane software? No. But it does its job and it makes them money. Same with kids. We have 11-year-olds coming in and building apps that their friends

### 14 Pro Tips for Cursor & Composer [6:30]

start using and that gain a few real users. So there really is a 100x productivity gain here, where things you just weren't able to do before people are now able to do, which also may contribute to a little bit of the AI overhype, because we don't see the same 100x gains for developers. But we still do see gains. And I would split our developer tools into two segments. One is the IDE layer, where you find tools built on top of foundational LLMs: things like Copilot, which most of you are using, Cursor, Windsurf, IntelliJ, Cline, and most recently Google Antigravity, which just came out, so I had to update my slide right there. And then on the other end we also have a tier of tools that are terminal-based CLIs. These are typically made by the foundational model makers themselves, so Claude, ChatGPT, Google, Kimi: they make their own models available in this CLI format. I'm going to go over a lot of stuff that many of you might already know, since we are at the intermediate level, but hopefully all of you can move up one step here and, by the end of it, pick up at least one CLI tool and one IDE tool if you haven't already.

So, getting into choosing your AI copilot. As we talked about, there are tons of tools. Most people are using Visual Studio Code, according to that same Stack Overflow survey: about 75%. But what was really interesting is they went one level deeper and asked those people who are using Visual Studio Code, what tool do you want to work with in the future? And the top responses were Claude Code, Cursor, IntelliJ IDEA, and Neovim. So the two that we'll go over today are Claude Code and Cursor. Not because they were chosen by the survey, but because I also think those are probably some of the top tools out there right now. So, we're going to do a speedrun of the top 10 Cursor tips so that if you've never used Cursor before, after this session you can download it and be a pro. Starting off with tip number one: tab.
I think 2% of the people in here said that they use zero AI for coding, and for those people I would really recommend you start with this tab. Cursor has built its own specialized custom model for this and it's really good. A lot of times you will type 10 to 20 lines of code by just hitting tab, not lifting a finger. And it gets suggestions based on your recent changes, your linting, and the accepted edits that you make. So it's really, really great. If you hate AI, just download that, let it show you what it's going to generate, and try it out from there.

Two is the Cursor agent. I'm sure a lot of you have seen this and used this. What's really great about this is you can choose what model you want to use with the Cursor agent. So you can try Gemini, ChatGPT, etc. And what I really love about this agent is all of the tooling that it comes with. It can read different files. It can search the web. It can run things in your terminal and use MCPs. All of these tools are what really makes its agent so great. And a recent feature that they launched, which is also really awesome, is this multi-agent mode, where you can type in one prompt and it will generate three, four, however many different responses you want to the same prompt. And actually I went ahead and did this for a couple of the most popular models to see what they would generate. So, first up, I have Composer. Some of you might not have heard of Composer before. Composer is an LLM that Cursor made themselves. While it's maybe not as good in code quality as some of the other top-tier models, what it really specializes in is speed. A lot of the changes you're making in Cursor are just simple changes where you don't need that smart of an AI, and this really, really helps. So, this is the output it generated. I asked them all to generate a landing page for a MacBook M5 Pro being released. Composer did this in 17 seconds.
In comparison, Claude Sonnet took about a minute, and this is what it generated. And then finally, ChatGPT Codex took about 2 minutes, and this is what it generated. Now, this is a small sample size. It's one prompt. So this is not a full study, but just to give you a general idea of what these outputs might look like and how much time they will take. And if you put them all side by side, this is kind of how it looks. You can take your pick of what you like the most. I honestly might like Composer the most out of all of them. And then just for fun, since Gemini 3 came out yesterday, I tested that one as well, and this is what it generated in 34 seconds, which might be the best design, but we'll have to do more testing to see how good it really is.

All right, tip number four is shift-tab. By default, your agent is in agent mode, but if you hit shift-tab, you can switch it to ask or plan mode. And these are really helpful as well. If you're trying to just understand your codebase and not make any changes to it, or maybe just use AI as a thought partner, you turn on ask mode and you start chatting with it. You

### Claude Code: The King of Complex Feature Research [11:15]

can even have multi-agent chat mode. So you're having different experiences: Gemini is giving you some advice, ChatGPT is too, and you're seeing what the different ones say. And then you can go ahead and also use plan mode, where once you know what you want to do, you type it in and Cursor will generate a plan for you. This will be a readme file with all of the steps that it's going to take. And it's going to implement every step, test it, and keep going forward. But before it goes into it, it's going to give you a chance to review its plan. And this is really good for the really complex tasks that you're going into. One more thing that I really like with all of these is when you integrate it with multi-agent: I love, when a new model comes out, shadowing the previous model I was using with it. So for example, right now I'm liking Composer mixed with Claude a lot, right? But Gemini came out and I'm wondering, is it better? So what I'll do is I'll have all of my prompts generating twice, once with my regular LLM and once with Gemini, and I'll try it out for a week, see which one I like better, and switch over from there. So I think that's a really good way of testing, every time these new LLMs come out, which one you like the best. Cool. And for plan mode, you can see this is kind of what it's going to generate. It's going to generate a checklist, and you'll see in real time, as it goes through it, that it checks off everything it completed. And again, really good for complex features.

Tip number five is turn on Cursor sound. This one might be underrated. A lot of people don't know about this one, but the biggest pain point of producing code with these LLMs is the wait time. You'll put in a prompt, you've got to wait 2 minutes, and then you forget about it. You come back.
In fact, it's so much of a problem that YC recently funded this company called Chad Labs, where they have a brain-rot IDE and it will let you play video games and watch TikToks while you're waiting for your code to generate. This company raised a ton of money for this. So you can tell it really is a problem. But I wouldn't recommend this as my go-to IDE. Just turn on Cursor sound instead.

Tip number six is custom commands. Another thing that's awesome about Cursor is that commands that are repeatable, that you're using a lot, you can go ahead and make into custom readme files. For example, if you want to create a PR and you have a certain format you're always creating your PRs in, you can create a readme that specifies that. And then instead of having to tell Cursor the same thing every time about how you want your PR format to look, you just type in a slash, you put in the command, and Cursor has the context on that.

Similar to commands is rules. Rules are a little bit different. They're not markdown files in Cursor. They're MDC files, which are a little bit different. What this allows you to do is give a description of each rule and also give it a rule type. So you can choose if you want your rule type to be "always apply". If you click on this, that rule will apply to every chat you have. This would be good for things like asking your code to not generate comments. If you don't like your LLM generating comments, you put this in, so it always knows, hey, don't generate any comments for me. Then we have "apply intelligently", where the agent itself decides when it should apply the rule. Based on the description that you give to the rule, the agent will go and read that and say, hey, for this task is this a good rule to use or not. Sometimes it works, sometimes it doesn't work the best. The other one is "apply to specific files". So you can put it in a specific folder.
When certain files are being touched, that's when that rule is going to activate. And finally you can

### Mastering MCPs (Model Context Protocol) [14:40]

apply it manually. And that one basically just becomes commands. It's almost the same thing. If you're applying rules manually instead of creating commands, you'll have to type in an at sign instead of a slash and tell it, hey, this is the context I want you to add to this prompt right here. Another thing that's really awesome about rules is they have project-level rules, user-level rules, and team-level rules. So I highly recommend sharing these with your teams. The ones that are very specific to you, you keep local, but there are others, for example creating PRs in the same format, that are great to share with your team so you aren't all creating the same rules over and over again. And what's also really great is this AGENTS.md format, which is basically the readme for agents, and which has become a convention for a lot of these coding tools. Tools such as Codex, Cursor, Gemini CLI, and Copilot are adopting this AGENTS.md. What this allows is that the rules you create will work across all of these different agents, instead of each of them having a different format and you having to copy the same rule into different formats. The sad part is Claude Code doesn't support this yet, but hopefully sometime soon. If anybody in the audience is from Claude, please.

And then tip number seven is just Cursor rules: good rules. It's similar to prompt engineering, but the main thing is the context that you give it, instructing it the way you would write your regular documentation and giving it all the details. Just keep it short, under 500 lines. If it's something bigger than that, split it. You can nest rules: you can have one rule that references another rule within it. So go ahead and do that. Give it concrete examples and avoid anything vague. Some examples of Cursor rules.
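Before turning to those examples, here is a rough editorial sketch of how the four rule types described above might be modeled. It is not Cursor's implementation: the `Rule` class, the `RuleType` names, the glob patterns, and the `agent_picks` argument (a stand-in for the agent's own judgment, which is not simulated here) are all hypothetical and only illustrate when each type would be included in a prompt's context.

```python
from dataclasses import dataclass, field
from enum import Enum
from fnmatch import fnmatch


class RuleType(Enum):
    ALWAYS = "always"            # applied to every chat
    INTELLIGENT = "intelligent"  # agent decides, based on the description
    FILES = "files"              # applied when matching files are touched
    MANUAL = "manual"            # applied only when referenced with @


@dataclass
class Rule:
    name: str
    description: str
    rule_type: RuleType
    globs: list[str] = field(default_factory=list)  # used by FILES rules only


def active_rules(rules, touched_files, mentioned, agent_picks):
    """Return the names of the rules that would apply to one prompt.

    touched_files: paths the task touches
    mentioned:     rule names the user referenced manually with @
    agent_picks:   rule names the agent judged relevant (hypothetical
                   stand-in for the model's decision)
    """
    selected = []
    for rule in rules:
        if rule.rule_type is RuleType.ALWAYS:
            selected.append(rule.name)
        elif rule.rule_type is RuleType.INTELLIGENT:
            if rule.name in agent_picks:
                selected.append(rule.name)
        elif rule.rule_type is RuleType.FILES:
            if any(fnmatch(path, pattern)
                   for path in touched_files
                   for pattern in rule.globs):
                selected.append(rule.name)
        elif rule.rule_type is RuleType.MANUAL:
            if rule.name in mentioned:
                selected.append(rule.name)
    return selected


# Hypothetical rule set loosely based on the examples in the talk.
rules = [
    Rule("no-comments", "Do not add code comments", RuleType.ALWAYS),
    Rule("prd", "Company PRD format", RuleType.INTELLIGENT),
    Rule("api-style", "Conventions for API handlers", RuleType.FILES,
         ["src/api/*.py"]),
    Rule("create-pr", "PR description format", RuleType.MANUAL),
]

print(active_rules(rules, ["src/api/users.py"], {"create-pr"}, set()))
```

The point of the sketch is that only the "always" rules cost context on every prompt; the other three types stay out of the context window until a file match, an explicit mention, or the agent's judgment pulls them in.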
One could be a refresh.md, where, if a bug persists, you give the AI a specification to search all over the codebase and dig a little deeper into each area. Sometimes this really helps when you're stuck on a bug. You can do a no-comments.md where you say, "Hey, don't add any extra comments. I don't like the comments." Or you can do a PRD.md where you just want to generate a PRD. You have the format that PRDs are written in at your company, you put that in there, and a lot of the time that Cursor saves is actually on things that aren't even coding, like this.

Tip number eight: MCPs. I'm sure you have all heard of MCPs. These are great. You can gain so much more functionality when you take the time to set these up. And I think this is where you make a lot of that expert-level gain, when you set up these MCPs. I highly recommend taking the time to try to set this up. The one caveat is that there is a maximum: you can have 80 tools at most in Cursor, and even then I wouldn't recommend having 80 tools. You really start to see the models deteriorate when you give them too much context, and they have trouble figuring out which tool they want to use. So don't add too many, or if you do add a lot, make sure you turn them off whenever you don't need them. Of the top MCPs that I would recommend, number one is a document store. I think this is the biggest one by far. This helps so much.
There are so many gaps between the code and the written documentation, things that AI won't understand and only you know, but once you hook up Confluence, Google Docs, or whatever you have your documentation stored on, this is a really big unlock. The second one is version control. This is basically just GitHub, which is obviously great to have. Third is any project management tool, whatever you're using: Linear, Asana, Jira. This lets you automatically grab tickets and have it solve them, or you can create tickets. It saves a lot of time as well. Another one that's helpful is having any sort of database MCP, and preferably having it set to read-only so you don't end up wiping out your database by

### Lessons from Databricks CEO Ali Ghodsi [18:22]

accident. But things like Snowflake or Supabase, if you need to query something or understand something from your database real quickly, are also very helpful. And any observability tools you have as well, Datadog, Prometheus, whatever you have: when you're trying to debug, having access to these logs really helps the LLM as well.

Another thing is manage your context window. I think a lot of times when I see people saying this AI isn't working well for me, it's because they open one Cursor agent chat, they start typing in it, they type a little bit, the next day they come back and type in the same chat, and now this agent has so much context in it that it just starts to deteriorate. So whenever you're switching tasks, make sure to open a new agent so it has fully fresh context, because it really does have a big impact. And I would say prompt engineering is important, but much more important than that is context engineering: what context you're giving your LLM matters a lot more than the specific way you're formatting what you're saying to it.

Tip number 10, Cursor checkpoints. It's a pretty simple one. We already have Git for version control, but sometimes you have a chat that's doing really well, then you give it the wrong prompt and it completely sidetracks into the wrong direction. It's just good to know that you can restore back to a previous point in the chat and continue from there.

And I know I said top 10 tips. It's just because top 10 tips sounded better than top 14 tips, but I actually have 14 tips. So I'm going to give you four more. Another thing that I love about Cursor is that it indexes your codebase immediately. When you download your codebase, 80% of it at least gets indexed. And then from there, when you add new files, it'll adjust. When you delete files, it'll adjust. Sometimes for really large or complex files, it'll leave them out just for the sake of performance.
Tip number 12: the Cursor Slack integration is awesome, or any type of AI integration, for these really small things where maybe you make a PR and somebody comments on it, hey, you forgot to change this styling format, or, hey, can we update this config variable? It makes it a lot easier. You just tag Cursor and you don't need to go into your codebase to do it for simple changes like this.

Tip number 13 is Cursor browser. They recently launched this one too, where you can actually see your app live next to your agent, which is great because now it has access to the console logs and the network traffic, and it can actually test your applications for you.

And then tip number 14, use at your own caution. There's YOLO mode, where it's auto-accept and you can tell the AI, hey, accept whatever changes I make. You probably want to stay away from this, but a case where I've seen it be useful is when you want to write tests. You can tell it, hey, write some tests, then code, then run the tests, see if it works, and then iterate, and you can get a bunch of tests generated by the AI just looping back and forth testing your app.

Okay, so now that we've gone through Cursor, and Cursor is just so great, why do we

### Reassessing Assumptions: Shrinking the Process [21:10]

even need Claude Code? I want to share a real-world example that I had of Claude versus Cursor and where these two tools shine in different areas. I can't get into the specifics of what I was doing, but I was looking to implement a feature and I gave the same basic prompt to Claude Code and Cursor. What Cursor did is it just selected a single solution, told me about it, and then executed on this non-optimal design. Then I tried Claude Code, on the other hand, and it searched the web for open source repos. It presented three different options for me with pros and cons, really high-quality analysis, and it saved me a ton of time. And I think that's where Claude Code really shines. In comparison to Cursor, for small changes Claude Code actually kind of sucks. It will over-engineer things and research them too deeply. But when it comes to complex features and research, Claude Code is a lot better. It does burn a lot more tokens, but I really think it is worth it. So if you have some big task, some hard design you're trying to implement, I think Claude Code should be your buddy that you're chatting with. And then Cursor, on the other hand, is really good for those quick outputs using their Composer LLM. If you want to try different LLMs, to see whether, if one doesn't work, another one might have an answer for you, Cursor shines there again. And you have all the visual bonuses that come along with it as well.

Okay. And then the section on Claude Code. A lot of it is similar to what we talked about for Cursor, so I'm going to keep this one short. I'm just going to (oops, sorry) go over the core four items that you probably need to know for Claude, which are skills, sub-agents, commands, and plugins. Skills are similar to rules. We had MDCs for rules in Cursor, right? And we could check if they're auto-applied or not.
So skills are basically those auto-invoked rules that we have in Cursor: if Claude should note something when X comes up, when we're discussing Y, that's when you would use a skill. Sub-agents are explicit workflows. So if you want some specific task to be run, that's when you're going to use a sub-agent. And what's really cool about sub-agents, which we'll hop into later, is you can give them specific access to different MCPs. And then we have commands, which we just chatted about. Same thing in Claude Code. And finally plugins. This is their way of distributing packages. You can bundle together your skills, your agents, any commands you have, and then make that a plugin, and other people from your team or outside of your team can download that and use it.

So, starting off with skills. Again, similar to rules, but a skill that I have up here is converting a blog post to the template that I wanted. I would have a rule here saying, if I'm generating a blog post, how should it look to match my company format? And it would make that conversion for me. Second is commands. We already chatted about this as well, but a command might be a PR command where I tell it, hey, just create a PR for me, so I don't have to type in the three or four commands it takes to make it. And then three, and this is the fun part, is sub-agents, which I think is really one of the big benefits of Claude. We have our main Claude agent in our terminal chat, but then we have sub-agents that it can call. So for example, if you get paged, you can have a PagerDuty sub-agent that has MCP integrations with Slack and Datadog, checks what page has been raised, goes and investigates the logs, and tries to find a root issue for you. Or you can have a documentation sub-agent where, whenever you make a PR, you can tell it, hey, go ahead and update our docs according to these changes as well.
Or you could have a Karen sub-agent where you can ask it to go through Slack and Jira and see if everybody finished their task or not and then eliminate or

### Q&A: Regulated Industries & Legacy IDE Features [24:45]

notify the users. Yeah. So for sub agents, you want to use these when it's a very specific purpose uh that it has uh and you want it to have specific access to certain MCPS. And what's really great about these is they have their own context window. So it doesn't pollute the context of your main agent. They have their own context windows. Um and yeah, and then finally, plugins. Just a quick overview of plugins. Again, you can bundle together your agents, your skills, and commands all together. and just kind of a visual representation of what you can have and other people can come and download it. Okay, now we've gone through cursor, we've gone through cloud. Are these the best two tools? For me personally, yes. But it's definitely it's so head-to-head. It changes just about every week and it really depends on personal preference at the end of the day. I think two other ones that I say would say are really close to these are Klein and Codeex. Um, client is a little bit cheaper than cursor is, but I prefer cursor because of the composer model that it has itself and the indexing that it does on its coinbase that client does not. But I've heard a lot of people saying I get way better results when I'm on client than cursor. So try both of them out on your own. See what you like better. Same with cloud code versus codeex. I've heard a lot of people say I get better results on codeex than I do cloud code. 
I personally think cloud code is a little bit better at knowing when it's wrong and not confidently saying u some answer and I feel it goes a little bit deeper in the educational aspect of it will explain what it's doing to you better than codeex does but again I don't have any statistics on this is like personal experience you'll have to try it out on your own and see what you like better and then finally anti-gravity from Gemini which just came out yesterday I would highly recommend this one too this one's really cool because this is the first time we're seeing a foundational model, build an IDE. So now you have access uh to Gemini on the foundational model with Google anti-gravity, but they're also allowing you to tap into ChatgPT, Claude, and other tools. So I feel like this might be one of the best ones because of this reason. It's like kind of the first of its kind in this way. Now, a couple of bonus categories. One, I think actually even beyond coding. So I' I've been in industry for roughly about a year now and this I think has helped me more than anything. Having AI u for not only docs but just explainability so many times uh there's just documentation lacking at a company and I'll just use cloud code to walk through it and chat with me and explain the codebase to me and that really helps me in areas where otherwise I might have had to ask other people for help. And one tool that's really awesome is deep wiki. So it's AI docs for any repo. They have over 20,000 uh repos on the web already indexed. I was working on a project this week that had zero documentation for their open source. Not zero, they had like one page, but it was quite terrible, but this deep wiki really it wasn't perfect, but it really helped me out in building my project with these AI generated docs and it has a chatbot next to it where you can ask questions uh to the documentation as well. So, highly would recommend this. Second is AI code reviewer. 
I honestly to be haven't gone too deep into any of these. Um, I've heard that Code Rabbit is the best, but just in general, having an AI code reviewer does save a lot of time and it can help you catch some of those small syntax errors, styling errors, or if you have specific formats on certain PRs and certain checks you need to make, these are great for that. Another thing is low code tools. I think part of our responsibility as developers is to try to help out our non-developer friends at work, right? So introducing them maybe to cursor but more likely to tools like lovable and na then because this is really like a superpower for them and if you can just introduce it to them and give them a little tip like they're going to love you forever because of things that they're going to be able to produce off of this. So I highly recommend that quick show of what Naden is. Naden is basically workflow um like you can build AI automations and workflows on NAN. What's really awesome about it is it's very low code. They have integrations with just about any app you can think of. So you can hook it up to your email and you just like click on that node and it's a drop-own box where you click different variables. No code really necessary. But if you do want to build something more complex, they have these JavaScript and Python code modules that you can put in there and you can actually go in and type code. Um but again for non-technical business people, this helps them set AI agents up pretty easily and has a lot of impact. And then the other thing I want to go over is evaluating impact. um how do you find out if AI is actually having a productive impact on what you're doing and what your company is doing? And I think I've seen a lot of people try to figure out the right metric for this and I think the conclusion is there really isn't any. 
What matters more than anything is finding different metrics you can track so that you can reference them later when you need to. A lot of the time you will have a story to tell from your qualitative experience, and these metrics just help you back up that story when the time comes to present it. Finally, evaluating costs. I didn't really go into this much, partly because a lot of you will be spending company money anyway, but also because I think most people are underutilizing AI right now, and it just makes sense to overspend. Go ahead and invest heavily in it. Overspend for the first six months, see what the results and gains are, and then adjust and cut back if you need to. One shout-out: the Kimi model is particularly good for low-cost, high-quality outputs. However, I will say that if you're using the Kimi model with Cursor, the way it does tool calling is different, so you won't get its full capability. So I wouldn't recommend using it in a tool other than its own CLI. I also wanted to touch on a couple of other things from the Stanford research paper we talked about earlier. The gains from these productivity tools differ depending on what you're working on. On greenfield, low-complexity tasks especially, you want to be using AI almost every single time. That's where the majority of the productivity gains are made. But if you're working on brownfield, high-complexity tasks, sometimes you might want to ditch it. It might not be that helpful. You can try, but it really depends on the case. The other factor is language popularity. Python, Java, and other highly popular languages perform a lot better. If you have an older, less popular language, it's going to be hard to create anything. And then I want to go beyond writing code.
So last week I got to be a speaker at an event for a bunch of CEOs, and the CEO of Databricks, Ali Ghodsi, was also there. He shared a story that really stuck with me, and I wanted to share it with you. He talked about how they build these connectors at Databricks, and it typically takes them four months, sorry, four quarters, to launch one. Then some new AI tool came out. I don't know what it was, but he said he went home, tried it, and in about a day he was roughly able to get that connector working. Not 100%, maybe 80%, but he thought, okay, this is great. He took the tool, passed it on to his teams, and said, "All right, guys, let's cut down the time." They went and researched the tool, came back, and said, "Yeah, it's good. We can cut down from four quarters to three quarters." He said, "I don't really understand why." He tried pushing back a little, but they said, "This is just what it is. Because of XYZ reasons, we're not going to be able to do it." So he said, "Okay," and gave up. He wasn't really sure, but he let it go. And then, sorry, I keep hitting my mic. But then he talked about this one German employee who came in and completely revamped the whole process. Through a couple of things he did, he was able to go from four quarters for one connector to seven connectors in one quarter. So that's a 28x gain in productivity, something like that. So what are the takeaways, and what did he say? The top thing is that people are just people. We're all human at the end of the day, even the best of us engineers, and we're resistant to change. So one thing that works really well when you're trying to make these changes with AI is to bring in a fresh set of eyes and have them reevaluate assumptions.
On a lot of teams, we might not want to make a change within our own team because it's going to cause us a lot of work. But when somebody else comes in and makes it happen, they don't care how much work it's going to cost you, and in the end that leads to more productivity for you. Number two, he said there are yaysayers and naysayers. And he didn't bash the naysayers at all. He said they're both right. They both come up with logical reasons that make sense for why you should or shouldn't do something. With that being said, he said, "I've learned that almost every single time, if it has to do with pushing boundaries in AI, find the yaysayers and put them in those positions to lead." Why I share that with you is for the few of you who maybe aren't using AI much. If CEOs out there are saying they're going to put the yaysayers in positions of power, then even for your own personal gain, you might want to consider exploring it a little, even if you don't think it's the best option, because that is likely what your management wants. And then three is they treat software as actual software. He said the cost of coding is lower now than it has ever been, and sometimes it's the other tasks that take up a lot of the time. What happened in this case is they found that about 80% of the time spent building these connectors was PM work. It wasn't the actual software work. And that's how they were able to cut the process down: by cutting out a lot of the user interviews and cutting out writing the PRDs. They just took a risk. They said, "We're going to build this software. If it doesn't work, we'll just build it again, because the cost of building is so much lower than it used to be. The cost now is really in the PRD generation. So let's risk it. Let's take a shot, and we'll iterate, rebuild it, or scrap it if we need to." So yeah, that's learning number three.
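
The numbers in this story can be checked with a few lines of arithmetic. The sketch below is an editorial illustration, not something shown in the talk: it computes the throughput ratio from the two figures quoted (one connector per four quarters, then seven connectors per quarter) and uses an Amdahl's-law-style estimate to show why speeding up only the coding barely helps when about 80% of the time is PM work. The function name `overall_speedup`, the 10x speedup factors, and the assumption that the remaining 20% is all coding are hypothetical.

```python
from fractions import Fraction

# Figures quoted in the talk.
quarters_per_connector_before = 4  # "four quarters to launch one of these connectors"
connectors_per_quarter_after = 7   # "seven connectors in one quarter"

# Throughput in connectors per quarter, before and after.
throughput_before = Fraction(1, quarters_per_connector_before)
throughput_after = Fraction(connectors_per_quarter_after)
throughput_gain = throughput_after / throughput_before


def overall_speedup(coding_share, coding_speedup, other_speedup=1):
    """Amdahl-style estimate of end-to-end speedup.

    coding_share: fraction of total time spent coding (assumed, see below).
    coding_speedup / other_speedup: hypothetical speedup factors for the
    coding part and for everything else (PM work, interviews, PRDs).
    """
    coding_share = Fraction(coding_share)
    other_share = 1 - coding_share
    return 1 / (coding_share / coding_speedup + other_share / other_speedup)


# The talk says about 80% of the time was PM work; treating the remaining
# 20% as pure coding is an assumption made for this illustration.
coding_share = Fraction(1, 5)

# Hypothetical: AI makes coding 10x faster, the process stays the same.
coding_only = overall_speedup(coding_share, coding_speedup=10)

# Hypothetical: the non-coding process is also shrunk 10x.
process_too = overall_speedup(coding_share, coding_speedup=10, other_speedup=10)

print(f"throughput gain: {throughput_gain}x")
print(f"10x faster coding only: {float(coding_only):.2f}x overall")
print(f"10x faster coding and process: {float(process_too):.2f}x overall")
```

Under these assumptions, even an unlimited coding speedup is capped at 1 / 0.8 = 1.25x overall, which is consistent with the point that the large gain had to come from the process around the code.
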
Shrink the process, not just the code. And then finally, a couple of tips. He said that in every situation now, we want to reassess all previously made assumptions. There are so many assumptions we made about our systems that are just no longer true with AI today, but we don't go back to re-evaluate them, which is where we miss out on a lot of productivity gains. So that's tip number one. Tip number two is that every company is dying to hire that German guy. So if you're looking for a promo or to move up in your career, try to be that German guy. Every business person is looking for one. And then finally, I want to talk about how AI is absolutely imperfect. It can have a lot of downsides as well. Of course, I'm not trying to make it seem like this is all upside. You can end up with bad changes and suboptimal design. It hallucinates very confidently. A lot of times your own skills may start to erode as you use AI more and more, which is another downside. There are security threats, and sometimes you have dependency risks too, where people write code they have no idea how it works, and then issues come up when they need to go fix it. So yeah, there are definitely trade-offs, as there are with anything in life or in software. But overall, the gain is going to be worth the trade-off. The takeaways I want to leave you with: one, don't just look at using AI to speed up your coding. Really look into the tasks beyond coding to see what you can speed up. Two, hopefully all of you coming out of this session will try an AI-powered IDE and a CLI tool. You might be shocked. I think a lot of the apprehension about these AI tools comes from somebody trying one out a year ago, when it wasn't really there, and never going back to try it again. But in 12 months they've made so many advancements that you might really be shocked, especially with something like Claude Code.
And then four, add some rules, add some skills, try them out, find some repetitive tasks, and see how it works for your workflow. And then five, continue to reassess any previously made assumptions in your workflows. And that's all from me. I'll open it up to Q&A now. Thank you. — Oh my gosh, we have so many questions already. I want to start by thanking you for this presentation. I'm going to record it. I'm going to upload it to our Netflix education service. All of our developers need to see this. Also, a reminder: please fill out the survey in your app and let Sepehr know if you have any feedback for him. So, first question. When you were talking about Claude Code versus Cursor, you compared the two on the same prompt, but was it the same LLM underneath or different foundation models? — Correct. It is the same LLM underneath. There is additional fine-tuning that they do for Claude Code, which makes it go a lot deeper than it typically would with just Claude in Cursor, and you'll see that in the number of tokens it uses as well. Good question. So my question is, since you're working at Coinbase, and I'm thinking it's a regulated industry, right, in which part of your business functions do you use Claude Code or any of these tools to write code? — I can't talk about Coinbase-specific stuff at this event, unfortunately, but in general, things like writing PRDs, any documentation writing, research, planning: all of these I typically start with Claude Code or Cursor. — I don't know if I answered your question exactly. — I was just asking from a coding standpoint. So these are all on the documentation side and that kind of thing, right? — Oh yeah. Oh, I can share this: at Coinbase, we are using Cursor and Claude Code. — Okay. — Yes. — Hi.
So you talked about AI-native IDEs, right, Cursor and so on. How mature are those IDEs in terms of legacy features? I use JetBrains Rider a lot, and it has a lot of refactoring features and nice tools for analyzing your code. Do those IDEs provide that as well? Are they mature enough, or are we just relying on the AI capabilities to do everything? If I want to rename a class, for instance, is there a deterministic feature that goes and refactors everything, or am I just going to ask the agents to do it? That sounds like a waste of energy, I guess, asking AI agents to do a rename, which is pretty standard in the industry. Yeah, makes sense. To be honest, I haven't seen the refactoring tools in Cursor. I'm not sure if they exist or not, but IntelliJ also has its own AI model that you can use. And I think AI is really strong at refactoring, at least in the cases I've used it for. But like you said, maybe it's a waste of tokens to use it for those things. So I'm not sure. I'd have to check Cursor's refactoring. — Okay. Thank you. — Of course. I just want to add that if you are using an IDE like Rider, you can use Claude Code inside the terminal in Rider, which is something our Netflix developers do as well. — Awesome. — Any other questions? If that's it, I also have some QR codes up here if you want to connect with me on LinkedIn, and I also have some marketplaces for Claude plugins and Cursor rules that you can check out if you want to see what other people are using and what's been working for them. — Perfect. Thank you everyone. — Thank you.
