5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems

22:36

5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems

Dave Ebbelaar 05.03.2026 11 227 просмотров 473 лайков

Machine-readable: Markdown · JSON API · Site index

Смотреть на YouTube

Поделиться Telegram VK Бот

Транскрипт Скачать .md

Анализ с AI

Описание видео

Want to start freelancing? Let me help: https://go.datalumina.com/W9QrjUT Want to learn real AI Engineering? Go here: https://go.datalumina.com/5qdZYMX 🔗 GitHub Repository https://github.com/daveebbelaar/ai-cookbook/tree/main/agents/agent-complexity 🔗 Introduction to AI Agents https://youtu.be/bZzyPscbtI8 ⏱️ Timestamps 00:00 Introduction to AI Agent Complexity 01:44 Level 1: Augmented LLM 02:08 Level 2: Prompt Chaining & Routing (DAGs) 04:11 Level 3: LLM with Tools 08:37 Level 4: Agent Harnesses 13:29 Level 5: Multi-Agent Orchestration 16:03 Final Thoughts on Complexity Levels 19:28 What I've Been Working On 📌 Description Most AI agent tutorials skip the part that matters: what actually works when you ship to production. In this video, I walk through the 5 levels of AI agent complexity, from augmented LLMs and DAG workflows to tool-calling agents, agent harnesses, and multi-agent orchestration, with real traces from a live client system. All code examples are on GitHub using PydanticAI and the Claude Agent SDK. 👋🏻 About Me Hi! I'm Dave, AI Engineer and founder of Datalumina®. On this channel, I share practical tutorials that teach developers how to build production-ready AI systems that actually work in the real world. Beyond these tutorials, I also help people start successful freelancing careers. Check out the links above to learn more!

Оглавление (8 сегментов)

Introduction to AI Agent Complexity

When you're building anything with AI, one of the most critical things to get right in the beginning is deciding on the level of complexity that you need. Do you need a full agent system or is a simple LLM call or a workflow enough to solve your problem? So, in this video, I want to go over the five levels of AI agent complexity that we see out there right now. And then I also want to share what's working for us in production right now with the client systems that we're building and also how that has evolved over the recent months. Now, as always, I'm going to walk you through some diagrams, as well as some code examples for you to follow along with. And now, if you're new to the channel, welcome. My name is Dave Abalar. I'm an AI engineer with over a decade of experience in the field. I hold both a bachelor's and a master's degree in artificial intelligence, and I run my own AI development company called Data Luminina, where we help our clients to build and automate with AI solutions and systems. And here on this channel, I share all of the practical takeaways for you to learn practical AI engineering as well. All right, so let's get started with the five levels of complexity. Now, as a quick introduction to this video for levels 1, 2, and three, I will go for it rather quickly because these are concepts and patterns that have been covered in great detail on this channel already. I will link the details below. What's more interesting to me is showcasing you what's working in production right now and some of the newer things that we can now do with some of the SDKs that have come out and then in particular, how do we bring all of that together in a production system. That's the primary goal of this video. So just know that all of the code examples are available in the GitHub repository. I will go rather quickly through them. The goal is not to go line by line in this video. I'm assuming you already have a foundational knowledge of what we're covering here. And if not, you can always open it in your favorite AI editor and go through it step by step. That's really how you ultimately

Level 1: Augmented LLM

learn this. So let's walk through things to start with the most simple level. Again, I'm assuming you're familiar with this. This is what we call the augmented LLM. Here we make a single LLM API call. I have code snippets in the repository for all of this, but I'm not going to go through it because again, I'm assuming you're already familiar with this. We take a simple API call. We use structured output. This is how we can engineer systems around it. So, let's go

Level 2: Prompt Chaining & Routing (DAGs)

to level two. This is where we get into prompt chaining and routing primarily using what we call directic asyclic graphs. For the past 2 years, this is really what I've been preaching on this channel. Don't get into fancy agent systems. Just take any process that you want to automate and decide if you can classify the first step of the system, the data that's coming in. What are the categories? So, here's an example of a customer care ticket coming in. Is it a billing question? Is it a technical question? Or is it the general question? This is something that an LLM can classify and then decide on. And then based on that, you can have very deterministic if else rules to decide how to handle that ticket. And now what's important to understand as cool as all of these AI advancements and agent frameworks like open claw etc are in production in practice right now the dags the direct as click graphs are still the bread and butter of how you automate systems B2B reliably and especially at scale but one of the big challenges with creating systems like this is that whenever you start a new project you start simple you start with an overview like this it's very maintainable everyone one can understand what it's doing. Your code base is rather lean, but then you start to add more to it. You start to automate more and more and what you will then end up with can often turn into this Frankenstein of a directic as graph. So don't think the example over here, but think like a 20x or a 30x on that often with multiple developers working on different branches of the deck. Now there are various patterns that you could use here. You could do some decomposition. You can use microservices but in the end it grows in complexity and figuring out the right pathway and especially debugging something when things go wrong just becomes more and more complex. This is one of the biggest challenges if you are working on production systems for large organizations, big teams, lots of data, it's the complexity. Recently, what

Level 3: LLM with Tools

we've been able to do with some of our production systems is introduce more edge nodes with a series of tools. So again, I'm assuming you're familiar with tool calling already. There is an example for you to go through the tool calling agent. But pretty much what it comes down to instead of making the hard if else decisions or the dictionary lookup where we programmatically say this is what you're going to do, we give the LLM a series of tools. So it can look into a database. It can look up a policy and the agent, the AI, the LLM really decides what to do. It can also do this in a loop and even call multiple tools. Now this is what you would call actually agentic at this point. agents reasoning over tools in a loop. Now, what we found is that most people are kind of like in either one of the two camps. So, they say, "Yeah, you should just make agentic systems. You should just give it tools and let it figure it out. " And you have other people who are more on the camp, right? No, it needs to be a DAC. deterministic. But the thing is the best systems, they use a combination of both. And that's also exactly what we are doing right now because the models are getting more and more capable. But we still really want to start from that structured place like route and classify as much as possible and only use tools really as our last resort. Now how that works right now is I want to take this screenshot over here. Let me actually zoom in a little bit for you. So this is a screenshot I literally just took from Langfuse which is our monitoring tool we use for all of our client systems. on the left over here you can see the full trace of everything that this workflow went through and I clicked on a particular step within this workflow here we have a chatbt 5. 2 into chat and here you can see all of the tool calls. So if you see over here like even above this analyze ticket note there are still five steps before that. So I want to show you this because like I've said this is a system we've been optimizing for over one and a half years. It's a customer care support system. It automates pretty much the entire customer care team for a company only escalating really the tickets that still need human intervention. And it's just it works. it just runs. So this image I think says like literally more than a thousand words or tutorials because you can learn a lot from this. So what we did instead right now and that has been working really well at the edge nodes of in this case the denaware defect agents we give it a series of tools and that's what you can see over here on the right. So in order to solve this problem the agent sometimes needs to request the missing information like how many pieces were in the dinnerware set. Then it can also get the specific product rules. So for this product set, what are actually the rules? Meaning what what's in the knowledge base with regards to company policies as to like getting an exchange or getting a refund. Having this in an edge node with all of these tool calls now works really well. And then you can see if I scroll down a little bit here, you can see it decides to go in here and it decided that in this case it needed to request missing info. Okay. So I wanted to take you through this example to kind of like showcase the evolution really of a production system which always still starts with trying to solve the problem as simple as possible. Okay, we have more edge cases. Let's create a graph around that we can control. Oh, the project is really going into complexity. Let's focus on the edge nodes. And that's where we introduce some tool calls. So these are the like the fundamental levels of the systems that we have running in production right now. So the project I just shared is from a client we've been working with for a long time already. That's what we do with our development company. And now if you're an engineer and you've been thinking about this idea of taking on side projects as well, maybe starting as a freelancer or even your own AI development company, but you don't really know where to start or how to find that first project, you might want to check out the first link in the description. There's a video of me going over how I can potentially help you with that. I've been running a community for over four years already where we specifically help developers and data professionals to get started with freelancing. What to do, what not to do, how to find your first project, everything that you need to know to get started. So, if that sounds fun to you, make sure to check it out and you can potentially join us and work with us

Level 4: Agent Harnesses

together. Now, when we get to level four, this is where it gets more interesting. So, here we get into what we call the agent harnesses. These are the harnesses that power the tools. like for example open claw cloud codec cli and here we go even deeper than an LLM with a bunch of tools here we also give it access to a complete runtime so a full runtime where it can do bass executions it has file system access it can do grab search web search we can give it external APIs via mcps or just scripts and then in the codebase over here in number four agent harness I have an example of how you can do that with the cloud agent SDK. So there's a Python SDK for this. You can pip install it cloud agent SDK and this uses entropic behind the scenes but it's super cool because it gives you the same functionality and features that you have access to in cloud code but then actually within your own applications. So it can run bash commands, it can search files, it can go on the internet and this is also where you can start to feel like look this is a little bit tricky right because this is super powerful but using a tool like cloud code when you are behind your machine and kind of like you are in the loop and looking over things is one thing but having this in a production system where it can literally like crawl the file systems do a lookup make changes potentially crawl the internet it's tricky it's super powerful but that's why it's very experimental So I'm going to show you a couple of examples using the cloud agency SDK as the harness but you can also set this up with alternatives. You can build something similar with pidentici or lang graph. Um if you look at what openclaw was built on they used pi mono. This is a typescript library but this is the agent coding agent harness that they were using. But it's this is a very new and evolving field. So let's go through a couple of examples because this is uh this is really cool if you see this coming together. So I'm going to first like scroll down to the bottom so you kind of like see what we're going to do here. You can pretty much through that SDK set up a claw agent and there's just a whole bunch of parameters that you can fill in. So you can give it access to tools. You can say look these are the allowed tools, the system prompts, the MCP servers, the permission, max budget, even sub agent environment variables. There is a whole bunch of things that you can configure, but this pretty much sets up a cloud codelike environment that you can run in your own application in the back end. So I've set up a quick system prompt over here saying that it can s it can browse through this knowledge folder that we have over here. So I put just some markdown files in here for it to go through and then the allowed tools. So it can read, it can glow, it can grab and then there are some MCP tools in here. So I c So you can also add your custom tools to this which is pretty much just a tool specification through which you can call an API. So we can bring all of that together. We can decide the output format. So we can also even for structured output we can set a max budget and then we can pretty much kick this off. So the goal is not to go line by line through this. If you want to understand how it works like go to this file and you can go through it. But I want to show you what this looks like if we then go to a terminal and we run this. So if we now start up the agent number four and we take the customer surfy request to it. So we pretty much ask it a simple prompt like hey here's a message from a customer go through this. You can now see that with just a few lines of of code really it's going in this agentic loop. So it's using all of the tools that it has access to. So you can see it's literally doing a glob pattern like what do I have access to? So I can see all the files. Let me read them. It's going through this. And then here you can see it's reading the refund confirmation. mmd. So here you can see that we're this is getting pretty agentic. And you could also see why this is dangerous to put into a production system. Right now I the permissions are pretty much open. You can definitely like make it more safe. You can put it into a container. You can say hey don't go on the internet. You cannot change any files right now. This is pretty kind of like yolo mode. And here you can see it created all of the artifacts, so the actions to take. But this is really where I see the future of Agentic Systems is heading. It's not only the models getting better and uh and and being to able to reason with more tools and more context, but it's the agent harness around it that is so important. And that's why uh cloud code for example is so freaking good right now. It's because of the agent harness.

Level 5: Multi-Agent Orchestration

So that's really an important area to like look into as an AI engineer to understand the different types of AI harnesses that are out there and to learn from the patterns that they are using. And an example of this like the pimono this is all open source. So you can literally just look into the tools here even though it's TypeScript. I assume most of you are watching are Python engineers but you can just look at the patterns or just ask lot code to create a Python version of this and you can start to understand how all of this is coming together. So if we then go even one level deeper, that's where we get into the multi- aent orchestration and I have one more example for this as well because one of the cool things that's already embedded in the cloud code agent SDK is if I come back to let's see the agent uh sorry over here to the setup you can see that you can even add this parameter of agents in here. So an agent then follows the agent definition but it has like another way that you can describe a prompt. You can give access to tools and you can also specify a model and the difference really between this is that each of the agents that the orchestrator decides to call on has a separate context window. So this is the big problem, right? Especially when you're doing longer tasks over a longer horizon. You'll maybe you may need to search through a knowledge base. The agent may need to dig into something and then it bloats the context uh window and then once it's found the answer like you may be at like 70 to 80%. If you use this pattern with the sub agents and the orchestrator through how they set it up in the cloud agent SDK, they all get a separate context window. So they kind of like spawn a separate version of cloud code, a separate window to then go off uh in again, do research and then report back to the orchestrator. So the context of the orchestrator stays clean. Now again, this is not something you can only do with the cloud agent SDK. You can also do multi- aent orchestration with pidentici with lang graph. There you have a little bit more control over do you actually want to share that context window or do you want to start fresh but the claw agent SDK just makes that super easy to set up and also fun to experiment with but very early. So like I've said we're not using it that into any of our production systems right now and testing with it internally because it's in some cases unreliable. It can also get pretty

Final Thoughts on Complexity Levels

expensive. So where does that put us right now? So here in this final graph I've combined all five of them. So we have the augmented LLM, we have the directic asylic graph which is the entire thing tool calling the agent harness and even the multi- aent orchestrator. And then what you should remember as an engineer as always use the simplest level that gets the job done and combine them. You can combine them. It's not like one way or the other, but just starting with your DAXs and using simple augmented LLM nodes in those that is still the bread and butter of reliable AI engineering. We've seen that now LLMs the models are capable enough to use tool calls in edge nodes. that's totally fine. But only use them when the complexity grows and when you really need to because a deterministic almost deterministic like DAG is always easier to maintain and create unit tests around than an LLM node with five tool calls. And then here is one final table to summarize all of that. Right? So consider the costs, consider the latency. These multi- aent systems, they run long and they cost money. In some cases, it's totally fine. If you have a coding agent, you're okay. spend money, take 10 minutes, take as long as you want. If you can solve the problem, that's totally fine. On the other end of the spectrum, you have the simple the automations, the cheap fast automations that we just want to be quick and we want them to be more deterministic, right? So that's the overview. That's the full picture, all five combined. And we now strategically per problem that we're tackling decide what we need, but we combine them. We put all of them together into one system. All right. So those are the five levels of AI agent complexity and what's working in production right now, what we're using and where I see the future is heading with these agent harnesses. And now in my opinion, it's very important to have a mental model of these different levels of complexity. Not only to like reason about them when you're deciding what to use for your projects, but also when new tooling comes out. So for example, like last month we had all the hype with OpenClaw, right? And what always is very interesting to me is that most people have simply no idea what the abstractions are underneath a tool like OpenClaw and what it's built on. So they think most people think now we have OpenClaw. It's this entirely new AI paradigm where all of a sudden it can do we can do all we can automate everything because of OpenClaw. And now don't get me wrong, OpenClaw is an awesome project. like it's well built. It's very cool, but it's pretty much building on all of the underlying principles that we already know, right? Using LLMs, giving them tools, having the right system prompts, and now it just took this principle from the coding agents like cloud code. So, also being able to create files to crawl the file system into search files. And now all of a sudden, you could hook that up to something like WhatsApp. And you have this very magic system. you as an engineer now understand and can break it down so that whenever there's a new technology or a new trend or a blow up in AI where people say everything changes you know what we have every week you can just see through that and see ah this is just another agent harness we have an LLM we have a bunch of tools we have files and we have prompts because that's all it is all right and with that

What I've Been Working On

we have come to the end of this video and I also want to give a little bit of an update as to what I've been working on because if you've been following the channel for some time already. You know, I haven't been as consistent with YouTube. I like to be more consistent, but it's tricky. You know, most of you probably kind of like when you think about me, you think like Dave Ear, the YouTuber, right? But for me, behind the scenes right now, I'm actually running three businesses. And YouTube for me is just kind of like the fun part of it, just sharing what I'm working on. But sometimes it just gets too busy. And the three businesses, I'm not running everything on my own. And of course, I have co-founders for all of the different businesses, but it's still a lot going on behind the scenes. So, with YouTube, I'm always trying to find, okay, look, when do I have time? But also, what do I really like want to share? Because again, if you've been following me for some time already, you know, I like to stick with the foundational stuff and don't really get into the hypy things, right? Here's open claw, it changes everything. Here's what you need to know type of videos. Like there are plenty of people who do that but like this is not really the channel for it. That's not why you follow me and come here to these videos. But that said then also if you look at to like the current state of AI engineering beyond kind of like the changes that I described in this video not much has changed for the past two years. Like really like it sounds crazy. AI is there is a lot of stuff happening but AI engineering principles the stuff that we share here is still pretty much the same. So, I've covered most of these topics already. So, now I'm also kind of like trying to figure out like look, what do I really still want to cover and one of the ideas is just to share more of the work that I'm doing behind the scenes. I think that could be really interesting. So, it's not as tutorial focused, but more so kind of like sharing the insights from that. But, I'm also curious. That's why I'm kind of like on a yap over here to kind of like ask you like, hey, what would you like me to cover? to see? What are you currently working on? what are you struggling with? What are the challenges? Right? So, yeah, that's uh that's one thing. And yeah, I've also I just came back from almost like a month trip to Cape Town where we hosted a huge AI event with some of the biggest creators in the space. I will pop up a picture over here. You'll probably recognize some of the faces on there. So, I was a month in Cape Town. Also, like I kind of like wanted to record a video there, but whenever I'm not in my office with all of this kind of like equipment over here, I'm always slacking. So, there's that as well. So, yeah, life has just been busy building businesses, uh connecting with people, traveling, and then unfortunately sometimes YouTube takes a little bit of a step back. But I'm still here for you guys. I don't plan on uh on quitting. I want to continue. I want to ride the AI wave at least until we have some form of AGI here. I want to be there. I want to share how we can work with that and how we can build systems around that, what that all looks like. But yeah, let me know if you stuck around till the end. I appreciate you. Thanks for watching and then um I'll see you in the next

Другие видео автора — Dave Ebbelaar

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Лучшие методички за неделю — каждый понедельник