Agentic Engineering - What production teams are replacing vibe coding with


Domino Data Lab · 15.04.2026


Video description
Vibe coding is fast — but fast doesn't mean production-ready. Learn how agentic AI workflows and an MLOps platform mindset close the gap between prototype and production.

1. Why vibe coding is repeating the notebook-era model graveyard pattern — and the MLOps platform disciplines (version control, testing, audit trails) that prevent it in regulated industries like life sciences and financial services
2. How to structure agentic AI workflows that actually ship: spec-first development, iterative story-driven builds, cross-model QA, and governance-first validation that ensures model reproducibility and traceability
3. What "quality" means when AI writes the code — a paradigm shift in standards, human oversight, and AI governance that enterprise teams in 2025 can't afford to ignore

Timestamps:
00:00 — The Prototype Graveyard: Why Vibe Code Doesn't Reach Production
04:00 — We've Seen This Movie: MLOps Platform Lessons from the Notebook Era
07:30 — Version Control, Model Reproducibility, and Governance Built for Regulated Industries
15:00 — Agentic AI Workflows: Spec-First Engineering That Actually Ships
24:00 — What Is AI Governance for AI-Written Code? Redefining Quality Standards
33:00 — How Domino Handles the Plumbing: Infrastructure, Audit Trails & Access Controls
38:00 — Should You Build It? ROI, Business Value & Agentic Engineering Principles

🔗 Request a Domino Demo: https://domino.ai/demo/agentic-ai

Ready to move from prototype to production without rebuilding your infrastructure? Join us at Rev, where we explore all these topics and more: https://rev.domino.ai

#AgenticAI #AgenticAIWorkflows #AIGovernance #MLOpsPlatform #DataScience #vibecoding #agenticengineering


The Prototype Graveyard: Why Vibe Code Doesn't Reach Production

All right. Agnes, are you able to see the slides here? — Yes, I can see them. Good to go. — Perfect. We're at 1:00 p.m. Eastern, so let's go ahead and kick off. I'll start with a quick intro: Jared here, field chief data scientist at Domino, working with a lot of Domino's customers to help implement business intelligence, data science, and AI solutions on top of the Domino platform. And then I'll kick it over to you, Agnes. — Awesome. Thanks, Jared. Hi, everybody. My name is Agnes Yun. I'm a senior sales engineer here at Domino, and I primarily support our life sciences sector. I'm really excited to talk to you today about some of these lessons. This was a conversation Jared and I were actually having about some of our clients, and it ended up resonating with a lot of folks on our team. So yeah, really excited to talk to all of you today. Cool. So, a new prototype graveyard. This conversation came up as we thought about vibe coding different applications, the impact of AI, and how this affects a lot of the different industries that Domino serves, especially those that are more regulated and compliance-heavy. We know AI is making it easy to build software, but that doesn't necessarily mean it's always going to be production-ready, or that the quality will be there. With that in mind, we kept talking about vibe coding and how it has obviously unlocked a lot for Jared's and my teams here at Domino, in sales engineering and the solutions go-to-market team. We were better able to understand how you can deliver extremely efficient demos and really effective tools while, in a lot of ways, a lot might be missing under the hood. So Jared fostered that conversation with me, and this is where we landed on the current state. — Yeah, this is definitely an interesting topic area. I was just looking through LinkedIn this morning and saw at least a handful of posts from people on one side or the other of this debate, for lack of a better term: vibe coding being fast, being able to use LLMs to do things you weren't able to do before, or, if you could already do them as a developer, to do them faster. But to your point, Agnes, where does that sit in terms of actually deploying solutions that are really valuable for the enterprise? We'll touch on that today: some of the patterns we're seeing and how to address vibe coding and the concerns around it, especially in regulated industries, where you can't just push out junk; there's no room for that kind of risk. Really, that goes for any industry. — Awesome. So, we've seen this movie before. As Jared mentioned, about a decade ago data science notebooks faced the same exact problem, and I think we can recognize the pattern as it starts to re-emerge with these technology trends. You have brilliant models on your laptop, but none of them make it to production. They're lacking version control. Testing is quite limited, if there's any at all. Monitoring can also be limited. Hence, there's no path to production.

We've Seen This Movie: MLOps Platform Lessons from the Notebook Era

And given that we're serving the life sciences industry, financial services and insurance, and also the public sector, we need to make sure all of this is something we can provide. Each project might be a custom journey, and that can be very time-intensive. So that model graveyard is what's happening again now, and we need to make sure we prevent repeating the same exact problem. — I love this pattern you've identified here, Agnes. I think it's spot on. We were really struggling 10 to 15 years ago, when data science became the new term. People were building models before that, but we started getting access to all this big data technology, were able to build more advanced machine learning models, and it was a lot of research. I remember I had a boss who would always say, "PowerPoint is where models go to die." We would build all these amazing models and they wouldn't get to production, right? And that path to prod is so important. There are some nuanced differences here, but it's really the same thing: there's an expectation of controls, testing, version control, monitoring, all of these day-2 items that need to be in place for software to be reliable, whether that software is a model, an application, or a data pipeline. If that's missing, there's an issue. And I think industry did a really good job of solving it. The buzzword was "big data" for a long time, then MLOps came up, and there was this whole movement, both on the process side (what can you do as a data scientist to make sure your projects are set up so they can be moved to production?) and on the tooling side, with Domino a leader in that space. I assume you're using a lot of these MLOps tools today, right? Both the process and the software that helps accelerate it. — Exactly. And just as you were saying, we see these industry trends changing so much, especially the keywords that are shiny, the next greatest approach and process. What we need to keep in mind is how to keep it scalable, functioning, and durable. Durable is the key word as so much starts to change. It's definitely the MLOps process problem in disguise, just as you have it listed here. — Yeah. As I tease that apart, there's a process piece (what do we do as vibe coders, or as people using LLMs to help develop code?), but then, what is the tooling that's required? Does it already exist? Does it need to be changed in some way to support these new patterns? We'll touch on some of that in the next couple of slides. — Awesome. Perfect. So, if we retrospectively think about how we addressed the MLOps lesson, how we closed the gap between great work and production software: first and foremost, we had version control and reproducibility.

Version Control, Model Reproducibility, and Governance Built for Regulated Industries

That means making sure you can rerun the work tomorrow, that you have the fidelity of all the different commits and changes, and that you know exactly what you did. You're also able to test and validate the work you're doing, so quality is something you can prove, not just promise. How do you ensure it's reproducible every single time, with your data, your analysis, and the different results and artifacts you're expecting? It needs to be solid and locked down. And when it comes to governance and audit trails, all of the key regulated industries we support have access to all of that, so they can actually trust the work you're delivering, and trust Domino as the platform itself. Lastly, Domino as a whole provides that infrastructure capability. The easiest way to think about it metaphorically is plumbing: it's something you don't have to handle, because we do it on your behalf, so your team can focus on higher-order activities and processes. That's how I think this has helped us in a lot of ways. Excited to hear your thoughts here, Jared. — Oh, it's just so applicable today, the reproducibility. I'm thinking back to the MLOps days, or even before that, the big data days, when Kubernetes became a thing and you could containerize workloads, and how much of a pain that was. Now we have platforms like Domino that make it so easy to spin up: you can go away from a project for two years, come back to retrain it, and not have to deal with Python package differences and all the rest of it. It's already pre-built and waiting for you in an existing environment. These are all really important things today, and OpenClaw is the example of people letting an agent go wild in their systems, and it deletes code because nothing is containerized or version controlled. I was just reading about someone saying, "It went and deleted my system, all my code, and I can't get it back. How do I recover?" Version control exists for exactly that reason. Or pointing it at production databases, and it deletes a production system because there was no governance and no way to roll that back. So again, a really great identification of the patterns from MLOps. I think there is a way to solve this. I follow Andrej Karpathy a lot on X (ex-Twitter, I guess). Vibe coding was a term he coined in early 2025 that really took off. Earlier this year he had another post that did not take off as well, but I think it hit a core point: there's a difference between vibe coding (talking to AI in natural language, asking it to build a basic website or app, playing around and seeing if there's a there there) and production software. There's a lot of value in that speed and in getting something in front of users to test the hunch. But the flip side is, how do you go from that to something that can actually be deployed in the enterprise, handle thousands of users, and be secure and maintainable so you can add features?
You hear a lot of people say, "I built this entire solution, then I needed to add a feature, I didn't know what was going on in the code, and it just broke everything; I couldn't move it forward." That's where this concept of agentic engineering came up. The term really hasn't taken off yet, but I think there's something there: there's going to be something different, and that's what we'll highlight in the next few slides. You can build this code really well, but how do you build it in a clear way? It starts with a very clear specification document: really understanding the nuances, spending a lot of time in the planning phase, and making sure you're building for real users. How are they going to authenticate into this system? Is it going to live in the public cloud, or somewhere else where you're restricted by single sign-on or internal networking within your organization? Then the testing strategy: not just "does this work?", but really getting into traditional unit tests, integration tests, and end-to-end tests, with tools like Playwright you can use to actually open the browser and click through things. Beyond that, the security and governance controls that need to be in there. What's really interesting is that governance goes beyond "is this regulatory compliant?" That's an important piece, but there are other techniques now being used in vibe coding. I'm thinking about a project where I had five different components, each with its own individual responsibility, and the LLM would shift features around between them. It became a complete mess; there was duplication. You can actually apply governance to your codebase to make sure the responsibilities of each component are complied with (a minimal sketch of that follows this exchange). So governance starts to take on a new meaning. And then, how do we ship this software in a way users can trust? Are you seeing the same thing, Agnes, on your side, in terms of this split between vibe coding and agentic engineering? And I'd love your thought: are they two completely different things, or do you go from vibe coding into agentic engineering? — I think in a lot of ways there's a lot of overlap. Right now, especially in the life sciences industry I'm in, they're separated. Vibe coding is more of an exploratory path that a lot of my clients are working through. In some ways, coding agents have helped democratize access to programming for statisticians or certain bench scientists who might never have had it before. That definitely unlocks capabilities, but there's also a huge learning curve as part of that vibe coding effort. At the same time, if you think about it from a traditional engineering perspective, getting to prod, doing all the system validation, and approaching it with QA in mind, you need to keep the structure in a traditional sense: having those specifications and following design best practices. So there's an interesting merge, and interest in it. But with the sector I serve, it definitely leans more toward production engineering, and there's a lot of room for growth there.
It's about giving time back to the key scientists, bench scientists, data scientists, and developers, automating a lot of the work with AI as a partner rather than as a separate, siloed tool, if that makes sense.
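An aside on the governance point Jared raises above (the sketch promised there): one concrete form governance-as-code can take is a guardrail test that fails the build whenever a component imports from another component it should not depend on. A minimal sketch in Python with pytest, where the `src/` layout, the component names, and the allowed-dependency map are all hypothetical:

```python
# Hypothetical guardrail test: fail when a component imports another
# component outside its declared contract. All names are illustrative.
import ast
from pathlib import Path

ALLOWED_DEPS = {
    "ingest": set(),           # ingest depends on no other internal component
    "features": {"ingest"},    # features may build on ingest
    "serving": {"features"},   # serving may call features only
}
SRC = Path("src")

def internal_imports(py_file: Path) -> set[str]:
    """Collect top-level internal components imported by one file."""
    tree = ast.parse(py_file.read_text())
    found: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found & set(ALLOWED_DEPS)  # keep only our own components

def test_component_boundaries():
    for component, allowed in ALLOWED_DEPS.items():
        for py_file in (SRC / component).rglob("*.py"):
            illegal = internal_imports(py_file) - allowed - {component}
            assert not illegal, (
                f"{py_file} imports {sorted(illegal)}, "
                f"violating {component}'s responsibility contract"
            )
```

A check like this can run in CI on every agent-generated change, so the LLM can refactor freely while the component contracts stay machine-enforced.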

Agentic AI Workflows: Spec-First Engineering That Actually Ships

Yeah, absolutely. It's your pair, your programming pair, that helps you move quick, but you've got to keep an eye on it as well, right? — It can't be left alone, that's for sure. After what you said about OpenClaw, that could open a whole other can of worms. But yeah, you're spot on that this is an emerging world right now; there's a lot in flux. Cool. So, if we think about where the real work lives now: Jared was talking about AI being able to write code in seconds and about prioritizing higher-order capabilities. What does that look like? First, define: understand what the actual business process looks like, what the user needs are, and what the success criteria are. That's absolutely key. It might feel cumbersome, but just as in any traditional project or study, you need to clearly understand the critical mission objective and what follows from it. Then you build. This is where the coding piece happens, helping democratize the work and make it move faster, with AI as an engineering partner. And after that, verification: always trust but verify. That's a phrase our internal team has been joking about, Jared, but I think it's true. You can trust some of the work being done, but how do you really know it's accurate? You need to keep the traditional disciplines of security in mind, and keep full testing in mind: unit testing, systems integration testing, black-box testing, everything you would do in a traditional engineering project. That same diligence applies here. And lastly, for any validation that needs to happen, the governance piece: how do we ensure everything is thoroughly traceable, repeatable, and reproducible? We need to keep that top of mind through this shift. — Yeah, for sure. I know there's a lot of talk out there about LLMs replacing developers, and software development is definitely the area getting disrupted first by LLMs, but I don't see it as a replacement. I see augmentation, and a shift in where people are actually required. As we all know, LLMs are trained on the public internet; they don't know your internal systems, and they don't know the nuances of your business beyond what's been published publicly. And the short of it is, for these coding projects there aren't really open-weight models at the same level. There are a couple out there now, MiniMax and a few others, that are getting quite good, but Claude Opus and Codex are definitely the way to go, and the cheapest inference, at least for the time being. So when you're using those models, it really pushes us to work on the integrations and on where the business value is: understanding how this actually moves the needle on core KPIs, which is something an LLM can't do today or for the foreseeable future. In a lot of enterprises, the data models, the nuances in joining keys, and the security between different systems aren't even documented well in most places; they're in people's minds. So there's this need to tease that out before starting any of the code. And then you can move really fast, right?
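Picking up Agnes's trust-but-verify point: the browser-driven end-to-end checks mentioned earlier (Playwright opening a real browser and clicking through) are one way to confirm the app really works rather than merely claims to. A minimal sketch using Playwright's Python sync API, where the URL, selectors, and credentials are placeholder assumptions for an imagined app:

```python
# Minimal end-to-end smoke test (pip install pytest-playwright).
# The URL, selectors, and credentials below are hypothetical placeholders.
from playwright.sync_api import sync_playwright, expect

def test_login_reaches_dashboard():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("http://localhost:8000/login")
        page.fill("#username", "qa-user")
        page.fill("#password", "qa-password")
        page.click("button[type=submit]")
        # Assert on rendered content, not just a 200 response,
        # so a broken or mock-data build cannot quietly pass.
        expect(page.get_by_role("heading", name="Dashboard")).to_be_visible()
        browser.close()
```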
The other thing is, if you're trying to hit the bullseye of the perfect product that's going to drive value, and you don't do all this planning while you're moving so quickly, you could totally miss the mark and blow way past where you should be. Pushing a lot of this work upfront mitigates that risk. A lot of good software development already did this upfront; the difference now is documenting it and really doubling down on it, because once you get into the build phase, a two-week sprint will probably build the entire product, versus before, when it might take six months and there was time to readjust. So, different ways of thinking. And then QA is getting huge with these LLMs. We're going to get to a controversial topic in a second, which is what that QA looks like, but at the end of the day you do need to go through the testing, security, and governance. I was just reading an article about a company that said, "We vibe coded this really great reporting system." They were promoting the results up to the CMO, who was talking to the board about it; they made investment decisions for over a quarter based on where things were going. The person writing the article said that after some time she was digging into things that didn't quite make sense, started validating, and found that the code was just creating fake mock data: the system still thought it was in test mode while they thought they were in production. It's not that the LLMs are lying, but Claude Code will sometimes modify the memory files and not the real code, and unless you're paying attention, staying on top of it, and doing that kind of real-world validation, these things slip through. That's different from how it has been in the past. — Yeah, definitely. There was one point, Jared, that you mentioned earlier that I wanted to touch on briefly: the notion of having to document all the work we're now doing to support these coding agents and LLMs. Even 5 or 10 years ago, when I was a nascent developer, I absolutely detested documentation, and I know most developers hate documenting too. The tools we have now unlock so much more: you can comment your code very diligently, you can leave really nice, detailed READMEs, and I think of that capability as already being a huge advantage to the community. We're in a different state than we were maybe 10 years ago. That, right off the bat, is huge. — Yeah, I don't know; I have mixed thoughts about that. What was the Agile Manifesto line? It's been so long.
It was "working software over comprehensive documentation," or one of those. — Exactly. — But I do see this problem starting to occur of a wall of text. Just because you can write something, who's going to consume it? More and more, the documentation is the handshake: it's less about the code being readable and more about the documentation being readable, so there's still a point where humans interact. But there's also a category of documentation that needs to exist for the LLM to work really well. Skills have emerged over the last six months as a way for LLMs to pull some context into the window and drive toward certain outcomes or patterns within the code. I just mentioned the project where I had five different components: I had separate READMEs within each, and then a structured README over the whole thing. It's less about preparing it because someone down the road is going to read it, and more about giving the LLM context on the codebase in different areas so it can manage the code itself and be more performant. But I do see this issue where you ask folks for their insight, or for a document on something, and what would have been two or three really concise paragraphs in the past is now 30 pages of just... — Fluff. Yeah. — And there's some value in that, and real nuggets do come out, but there's a gap there, and I see it happen. — Yeah. It has to be intentional, and I think that segues really well with the slide you have pulled up right now: keeping the standards and best practices of traditional engineering, but applying them with AI, training it as a collaborator. But yeah, go ahead. — Yeah, we've done a lot of tee-up here, and I want to get to something actionable you can take away. We're doing a lot of work within Domino around how to build really high-quality software using these tools, very quickly.

What Is AI Governance for AI-Written Code? Redefining Quality Standards

The first thing is understanding where LLMs are good and where they're not. It's out of scope to get deep into that here, but they started off being really good at building websites, then applications, and now they're starting to get better at infrastructure management and deployment. That's where it gets really high-risk, though: deploying Kubernetes or AWS resources, where things can go awry and you end up with big bills or security leaks. So there's a top-down gradient of where they're good, and it's reaching lower and lower in the stack. The other important thing when getting into these projects: this is not hands-off at any point. I read a lot of claims like "I had 30 agents running" or "my OpenClaw ran overnight and connected with Twilio." Go actually try to connect with Twilio: you have to go through a whole regulatory approval process to get a phone number approved for texting or calling. It just doesn't work that way, so that's all fluff. The real high-quality code comes from using it as a pair. You come up with a specification first, and that specification starts out more like the README of a typical repo but then becomes very in-depth. Some of my specifications, and the ones I'm seeing internal to Domino, are 2,000 to 3,000 words or lines, sometimes even larger, drilling into every piece of it. What programming framework or language are we going to use for the app portion? React? What is the backend database? We're not going to use Redis as a cache, right? You'll notice Claude ends up pulling in Redis, RabbitMQ, and other items you didn't ask for. So be very clear about what is in the stack and what is not, or entirely out of scope, at all the different levels. And this is an iterative process: you work with Claude or Codex to go through the problem. I actually find Codex better for the specification, then move into Claude for the next sections, but iterate and get it really tight. Next, I take that specification and build a set of epics and stories from it, and these LLMs tend to be very lazy about doing that. So you'll say, "Go through line by line and create a set of epics, or groupings, and then stories: what does the test look like? What is the outcome of that specific feature?" Again, you'll have to iterate through this multiple times. Sometimes I'll say I want at least a thousand stories, or try to judge how many stories to request, to guide the LLM so it doesn't get lazy. Then stress-test the spec and the stories, going back and forth, actually reading through them manually, going through them with the LLM, and tweaking. It's better to spend a week here getting it really dialed in than to jump straight in, if you want to build the high-quality piece. Then comes the iterative implementation; we've got a prompt with the full details as part of the blog, if anyone's interested, and we'll highlight that. This iterative implementation is essentially going through each of those stories, making sure they're unit tested, then, once they're ready for integration tests, doing those, and then end-to-end. So you're really building the tests as you go.
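One way to keep that "tests as you go" discipline honest is to pin each story from the spec to an executable acceptance test, so a story only counts as done when its test passes. A minimal pytest sketch, where the story IDs, the `app.users` module, and its `create_user` function are hypothetical stand-ins for whatever the spec defines:

```python
# Each story in the epics/stories document maps to an acceptance test.
# The app.users module and its behavior are hypothetical illustrations.
import pytest
from app.users import create_user, DuplicateUserError

@pytest.mark.parametrize("story_id, email, should_succeed", [
    ("US-042", "ada@example.com", True),   # happy path from the spec
    ("US-043", "not-an-email", False),     # validation edge case
])
def test_user_creation_stories(story_id, email, should_succeed):
    if should_succeed:
        assert create_user(email=email).email == email
    else:
        with pytest.raises(ValueError):
            create_user(email=email)

def test_us_044_duplicate_emails_rejected():
    create_user(email="grace@example.com")
    with pytest.raises(DuplicateUserError):
        create_user(email="grace@example.com")
```

Because the story ID sits in the test name or parameters, a failing build points straight back to the spec line the LLM drifted from.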
And again, this is not fire-and-forget. You want to sit there, actually watch, and nudge it in certain directions as it goes, so it doesn't drift off course. Then, finally, do a cross-check of models. I'll use Codex a lot for the specification and the stories, going back and forth with Claude to test them. Claude is really good at cold-start software implementation; Codex tends to be better at coming into an already existing codebase or repo and cleaning it up. But then do that cross-check across models. I heard a trick the other day that actually works really well: if you tell Codex that Claude wrote the code, it'll find a lot more issues than it would otherwise, and vice versa. So give it some context. The other trick I'll use is to say, "Hey, you're a FAANG engineer trying to poke holes in this codebase," pitting one model against the other. Give it a perspective to take: a security researcher or a hacker trying to break into the system, or a compliance officer. Running those different persona viewpoints across the code is also helpful. Are you using the same process, Agnes? I know we talked about this a little; there are a few different ways to approach it. What does your process look like, and where does it diverge from mine? — Yeah, for me, I actually spend the most time on the specification. I'm giggling because I remember, as a young engineer, being so nervous when I had to present a code review in front of senior staff devs and they just ripped my code apart. So when I think about things like that, and about user requirements, I want to make sure there's full context. I usually spend the most time on the specification. Then, for the stress-testing piece, given my vertical, there are so many use cases and edge cases I usually want to identify, so that's where I spend the second-most time. I definitely have to work incrementally; I run into issues all the time, and having tried a big-swing approach, it never worked for me; it ended up failing and causing more rework. And for the layered testing and cross-checking across models, one thing that really stands out to me is that every model is so different; each has its own strengths. Leveraging each of them is working smarter, not harder, and it helps you move a lot faster. But yeah, everything you said resonated with me; it's very relatable. — Okay. Fantastic.
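The cross-model, persona-driven review Jared describes can be scripted rather than run by hand in a chat window. A minimal sketch using the `anthropic` and `openai` Python SDKs; the model names are assumptions that will date quickly, and the "the other model wrote this" framing mirrors the trick above:

```python
# Cross-model code review: each vendor's model critiques code framed as
# written by the other, under an adversarial persona.
# Model names are assumptions; check current model listings before use.
import anthropic
import openai

PERSONA = (
    "You are a senior engineer trying to poke holes in this codebase. "
    "List concrete bugs, security issues, and spec violations."
)

def claude_reviews(code: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
    msg = client.messages.create(
        model="claude-sonnet-4-5",  # assumed model name
        max_tokens=2000,
        system=PERSONA,
        messages=[{"role": "user", "content": f"Codex wrote this:\n\n{code}"}],
    )
    return msg.content[0].text

def codex_reviews(code: str) -> str:
    client = openai.OpenAI()  # reads OPENAI_API_KEY from the env
    resp = client.chat.completions.create(
        model="gpt-5",  # assumed model name
        messages=[
            {"role": "system", "content": PERSONA},
            {"role": "user", "content": f"Claude wrote this:\n\n{code}"},
        ],
    )
    return resp.choices[0].message.content
```

Swapping the persona string for the hacker or compliance-officer framings gives the other viewpoints with no further changes.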
Well, this is the hot topic here. Do you want to kick us off? — Yeah, for sure. So what exactly does quality mean? Quality actually means something a little bit different now. I know this might be quite a paradigm shift to consider, but when AI is writing and maintaining code, as you said, the rules of good code are shifting. We have to hold it to a different standard, because the rules and conventions we used to follow look slightly different now, given the level of automation and the different level of trust we have to place in the content being generated. So, what matters less for me personally as I use more of these solutions and leverage AI more: things like file structure, prettifying your code, anything considered improper syntax or formatting. That matters a little less, because I'm assuming the tooling will catch a lot of those issues. Line-by-line code reviews: we don't necessarily need those as much. And internal abstractions matter a little less too. As for what matters more: code generation is a lot more seamless now, supporting many different languages and use cases, but being able to tie it back to the specification, the why, matters most. Why did you even embark on this? What was the key use case? What were the requirements? If you can't answer that, then what you created isn't great. The security piece we kept talking about, along with governance, auditing, and compliance, matters twice or three times as much now, because if you can't trace back and provide a justification for what you did and when, it won't hold up. And finally, more human oversight: at the end of the day, making sure you're signing off on a decision you were actually involved in. I think that's going to matter a lot more. — Yeah, totally agree. I laughed as we pulled this slide up, because this is an area where I think a lot of people are totally missing the ball. They're very opinionated about it, but they're not living and breathing the value of what LLMs can do.

How Domino Handles the Plumbing: Infrastructure, Audit Trails & Access Controls

And that value is transformative: 10x velocity improvements, if you follow some of these agentic engineering approaches and do them well. Just yesterday I saw someone I used to work with, a very senior technology leader a lot of people would recognize, arguing that if you don't understand the code the LLM has written, you're in trouble, that it's not high quality, and all the rest of it. That person is totally missing the ball; I won't call out their name. And I strongly believe this: LLMs use code in a different way than humans do. They prefer large files; we prefer small, clean, broken-out files. We'll have a utility function, such as a calculator, that we can modify once and that gets used throughout the codebase, a single point of management. An LLM doesn't care about that; it's indexing, it's changing things. The naming conventions, again, don't matter to it. So these human preferences, the things we need in order to write good code, make an LLM perform worse when you push them onto it, and you actually lose the velocity and the value it brings. It's a touchy topic, because if you don't look at the code, LLMs write these abstraction layers and wrappers and you end up with ten times as much code, even though you're moving ten times faster, and it becomes hard to manage. But that's an artifact of not writing the right tests. Like the guardrail tests I mentioned earlier: you can actually write tests that measure the complexity of the code and drive it down, doing Codex and Claude comparisons. You can take a really complex, junk-looking codebase and clean it up. But again, you're not cleaning it up to human standards; you're driving it toward the most elegant, optimized, and maintainable code for an LLM to maintain. I think this is really critical. If you're in an organization where there's still a human reviewer looking at the code, you're going to have to move slower and go through it manually; you'll still get some advances. But the organizations I'm seeing that are AI-first (Anthropic is a perfect example) are pushing out features every week. There are some issues sometimes, but they're moving quick, and it shows in their valuation, their impact, and the value they're driving. That's the proof that there is a way to do this, and I think it's starting to get tightened up. Folks who say "it doesn't look like human-written code" are totally missing the ball; I'd answer that you don't know how to read compiler output or the C code underlying Python either, so it's a moot argument.
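The complexity-measuring guardrail tests mentioned just above can be built with off-the-shelf tooling. A minimal sketch using the `radon` library under pytest; the threshold and the `src` path are illustrative assumptions to tune per codebase:

```python
# Guardrail test: cap cyclomatic complexity so the model is steered back
# toward code it can maintain, without a human line-by-line review.
# Requires `pip install radon`; threshold and path are assumptions.
from pathlib import Path
from radon.complexity import cc_visit

MAX_COMPLEXITY = 15  # tune per codebase

def test_no_function_exceeds_complexity_budget():
    offenders = []
    for py_file in Path("src").rglob("*.py"):
        for block in cc_visit(py_file.read_text()):
            if block.complexity > MAX_COMPLEXITY:
                offenders.append(f"{py_file}:{block.name} = {block.complexity}")
    assert not offenders, "Complexity budget exceeded:\n" + "\n".join(offenders)
```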
I'm going to quickly get through a couple of slides, because we're running out of time. As we mentioned earlier, there's a lot of repeat plumbing required, and LLMs are still not really good at authentication, login, making sure you have the right access controls, and governance. There are a lot of items around the underlying compute, getting down to the infrastructure and the containers we talked about for reproducibility, the audit trails, and the controls that keep code from getting deleted and keep someone from gaining access, through an agent, to data the agent can reach but they shouldn't. So: being able to propagate authentication through, and being able to deploy these things very quickly. We have agents at Domino now. Last weekend I was running one that was essentially predicting the Masters tournament, playing around with Coinbase a little, and building workflows to go out and monitor it hit by hit, and it's executing. That was essentially spinning up infrastructure and so forth; none of it would be possible on projects like that otherwise. And there are platforms like Domino that, as we say, do 80% of the plumbing, or, as I like to think of it, provide the building blocks, which lets you focus on the business problem and the business outcome while using this LLM-generated code in a way that is safe, secure, and governed. Let me go through this one quickly too, because we've touched on it a little: just because the velocity is high, you do not want to miss the target. The question is always: just because you can build this, should you? So, as Agnes mentioned, really focus on the ROI and the success criteria from a business standpoint. How is this going to drive value?

Should You Build It? ROI, Business Value & Agentic Engineering Principles

It's the same trap we saw with MLOps. People were building hundreds of models, but what were the really strong outcomes? People were running around with an AI/ML hammer looking for a nail, when a lot of the time simply joining two datasets together and building some reporting on top was what was valuable. The same is true today. So I'll kick this over to you to help us wrap up with the punchline. — Yeah, of course. So again: vibe coding got us started, but tooling alone isn't going to fill the gap. It has to be strategic; it has to have a technical initiative in mind. Treat AI as an engineering partner, not as a novelty: you need to be able to rely on it, the business needs to be able to rely on it, so don't take it for granted as a one-off. And again, strategy: you want to ensure that production is worth reaching in the first place, because that's ultimately where you want to end up. — Fantastic. We have a few other items here, but we've got to highlight this absolutely fantastic event coming up. Agnes, I know you're going to be in Philly, so why don't you kick this off? — Yeah, definitely. So we have Rev, one of Domino's flagship conferences. The Philadelphia one, on May 12th, will be centered around our life sciences sector. We'll have Stephen Hahn, the former FDA commissioner, as our keynote speaker, in addition to other speakers from BMS, Merck, and GSK. We have our New York one, Jared, which I believe you'll be at. — Yeah, absolutely, I will be in New York, and the focus there will be more on financial services, but there will be folks from other verticals as well. And then, finally, London, which I believe is a blend of both, if I understand correctly: pharma and finance. — Yeah. And sorry, I believe we'll also have Capital One with you in New York, plus Vivo and TIAA, as well as other guest speakers: folks from the Federal Reserve Board, David Palmer, Reid Blackman from Virtue, and, sorry, Eric Gibson from Novartis. So that's our lineup of keynote speakers; we have a lot of great guests joining us that week. — Absolutely. And there will be a lot of data science, AI, and IT leaders who are done with the experimentation phase and are actually scaling and driving real outcomes. So there will be a lot of good presenters, but also discussions around topics like what we covered today: vibe coding versus agentic engineering, and how to use conversational AI and coding assistants to really drive business value. — And it's a free event, for those who may not have registered, so please feel free to join us. It's a celebration of our customers: getting to better understand how they're using Domino, networking with each other, and seeing what's on our roadmap, along with other industry topics like the ones we discussed today. So if you haven't registered, please go ahead and do that. — And with that, I think that ends our presentation. Jared, our fireside chat was really... — ...capped off with a horrible ad read there. I don't think we're getting radio ad jobs anytime soon. Hopefully not. — Absolutely not. I like the day job as is, with you. — Good deal. Well, thank you so much for attending, folks, and we look forward to seeing you at Rev if you can make it. — Awesome. Cheers.
Thanks everybody. Have a good day.
