You're doing Agentic chat history wrong | OpenAI Agents SDK
32:57

You're doing Agentic chat history wrong | OpenAI Agents SDK

James Briggs 24.07.2025 2 911 просмотров 74 лайков

Machine-readable: Markdown · JSON API · Site index

Поделиться Telegram VK Бот
Транскрипт Скачать .md
Анализ с AI
Описание видео
Prompting is an essential component when working with LLMs, and Agents SDK naturally has its own way of handling various components of prompts. In this chapter, we'll examine how to utilise static and dynamic prompting, as well as how to correctly use system, user, assistant, and tool prompts to build event-based conversations, not interaction-based conversations. Then, we'll see how these come together to create advanced conversational agents that use chat history the right way. 📌 Code: https://github.com/aurelio-labs/agents-sdk-course/blob/main/chapters/01-prompting.ipynb 📖 Article: https://www.aurelio.ai/learn/agents-sdk-prompting Twitter: https://twitter.com/jamescalam LinkedIn: https://www.linkedin.com/in/jamescalam/ #aiagents #openai #ai #coding #artificialintellegence #programming 00:00 OpenAI Agents SDK 00:56 Agents SDK Setup 01:56 Static Instructions 06:03 Dynamic Prompts 08:38 Rethinking Agentic Chat History 11:09 Message Types 20:25 How to Use SDK Message Types 22:09 Developer Messages 24:31 Assistant Messages 26:37 Chat History 27:53 Function Calls 31:03 Conclusion for Agents SDK Prompting

Оглавление (12 сегментов)

OpenAI Agents SDK

Okay, so beginning with chapter one, we're going to be covering prompting. Now, I know prompting isn't the most exciting thing about agents in the entire world, but prompting is a core component of any AI system there. You just can't get around or avoid prompting. You have to learn it. And what I want to cover in this chapter is of course just the hands-on of prompting with agents SDK, but also I really want to talk about how you might want to think about the various prompts and the way of building a conversation with an agent in a very different way. And I think that will be very useful for a lot of people out there building with agents. So that part in my opinion is actually quite exciting and for sure very useful. So we're going to start in

Agents SDK Setup

the agents SDK course repo. We're going to go to chapters and of course we're going to chapter one. I would highly recommend that you go with the open with collab approach here. It's going to save you time rather than setting it up locally, but of course you can set it up locally if you prefer. I actually do hope set up locally, so that's what I'm going to do. So in collab, the first thing you'll need to do is just install open agent. So we're using 0. 1 as you can see here. Then if you're running this locally, of course, you should have already set up your UV environment. And you would just be clicking up here and clicking here to select that UV environment. Then we're going to come down and first thing we're going to do is set up our openi API key. Now you would get this from platformi. com API keys. Once you have it, run this cell here and we'll get a little text box pop up locally at least in cursor and VS code it's going to be here and in collab it will be just under the cell. So now we're going to take a look at the

Static Instructions

first type prompt which is the most basic. It's just a prompt and it is the static and it is static instructions. So this is when I say instructions here for those of you that have been using agents or LMS think of instructions as your system prompt because that is exactly what it is. So the instructions are the system prompt. They guide the behavior of your agent and we require them here. Okay. So in these instructions, we're telling our agent to speak like a pirate. And this is how we would initialize our agent. It's very simple. You have the name of the agent. You have the model that we'd like to use, which is GP4. 1 mini. And it is pretty simple. So we'll run that. And once we've defined our agent, we want to run it. Now the way that we run an agent is I suppose it would be it feels a little more complicated but it's still very simple and it makes sense. So there are these various runner objects in agents SDK. There are different runner objects that we you would use in various scenarios. I think for the most part you're going to be using this runner object. the others that you might be using are for like synchronous execution, which it would be kind of weird if you're building an AI application and writing synchronous code. In general, I don't think I've ever been on a project where we haven't needed to use async code. So, in general, I would say yeah, you can use synchronous code and methods. And I know that might be simpler if you haven't used async before, but I would learn async as quickly as possible. And it's very simple, especially in the when you're using these libraries. So in this case, all we need to do is okay, this runner object that we have down here, we're going to run it. This will run the async execution of our whatever we set up in our job here. And because it's async, we need to await it. Okay, so this is for the most part. This is all we rarely need to modify in many cases. Sometimes it can get a little more complicated. Depends on what you're doing. But anyway, in this case, all we need to do to run our async job is to await it here. Now there's a little more going on within this job. As you can see, we have a starting agent and we also have our input. The input is the user query. Okay, that is your user prompt that you may modify this in some cases. You can have multiple messages for example, but in this case it's just a user query. the starting agent here. The reason that we even have a signing agent is because a agent may also be able to hand off the task to other agents. And in that case, it makes sense that you would call this parameter here the starting agent because maybe our agent here can hand off to multiple other agents who may be able to hand off to other multiple agents. And of course, in that case, we're not just using a single agent. So this is why this is called the starting agent. It's where we start. However, in this case, of course, we just have a single agent. So we're just running our single agent. So we can run that and we'll get some output down here. Okay. And what we've asked Okay. So we've our system prompt, our instructions have told our agent to speak like pirate. We've come down here and then our input or user query has said write me a haiku and that is exactly what it has done. Okay, so that's great. It's working. Now let's move on to some

Dynamic Prompts

slightly more dynamic instructions. So this again is nothing complicated here. What we're doing is for example, as we'll see in a moment, let's say we would like our agent to be aware of the current date and time. In this scenario, we might want our instructions to include the current date and time. But of course, we don't precreate that prompt because then that prompt will include the current date and time from when we created the prompt. So what we need to do is pass a function to our agent that can be called whenever it is being used whenever our agent is being used and it can get the actual current date and time. So that's what we're doing here. We have this timebased instructions method. We have to pass a context which is this run context wrapper and we have to pass an agent here. So in the back end when the agent method or job method is running and calling this timebased instructions function it's going to be passing in those parameters even though we don't actually use them in this case. In other scenarios you actually might but in this case we don't. So here we're just saying okay the current time is time if it is the afternoon speak like a pirate otherwise do not. Okay, so right now it is just the afternoon. So it will probably speak like a pirate. So let's run that. We initialize our agent here. This is now the time agent. And we have our time base instructions. And we can come down here and I'm going to say hello. What time is it? And we'll see. Okay, it is in the afternoon. Now we can modify this just to confirm. Okay, is this working? I'm going to say if it is later than 1 p. m. speak like a pirate otherwise do not. Okay. Reinitialize those. Come down here. Okay. And now we can see that it is not speaking like a pirate. So yeah we have that. We can modify that. Of course, it can be, you know, these dynamic instructions can become much more complicated than what you see here. Okay.

Rethinking Agentic Chat History

Okay. So, we have that. Now, I want to talk about message types. And this is where I want to be very specific about how you should think about a conversation with a agent. Because the default thinking here when it comes to speaking with agents is that your conversation or chat history with a agent is a set of messages. Okay? It's like a set of back and forth like you would have with a person. Okay? It's like I say something, they say something, so on and so on. But I think and this is from years of building agents. I think the better way of thinking about chat history and interactions is not to think of them as chat history or interactions. I think the better way is to think about this as a sequence of events. This sequence of events does not need to be user assistant user assistant. It can be many combination of various things. And when we think more about these as events and the execution of these events, it becomes much more obvious in my opinion that an agent is more of a workflow execution engine that goes and performs different tasks based on particular triggers, right? Those triggers might be a user sending a message, right? That's the traditional approach, but it could be something else, right? You could just be doing something on your computer. You maybe you open a particular window and there is something some event that gets triggered when you open that window. Maybe it's your browser that goes off to an AI agent and tells the AI agent to go and give a summary of the current weather. Okay? and it will go and do that and oops it's going to come back right that's an event it's not a user you know user is triggering that event they might not even realize they are triggering that event but it's not a user interaction okay so in my opinion it's best to think about the interactions with a agent as being more a log of events okay and with that in There are five primary message types from OpenAI. So here we have our five

Message Types

message types. We have the developer message. This is our new system message. So OpenAI recently renamed this. So rather than calling the system message or system prompt the well that they are calling it the developer message or developer prompt. So, you know, just one of those complete switches of what OpenAI are calling various things for some reason. I don't know. So, for now, okay, this is now the developer message. I don't know if they could change that back because I can't imagine that change propagating across the wider AI industry, but maybe it does. So think about this as either the developer or system message. So we would have our developer message and that is what we have up here. Okay. So typical instructions again like more terminology that OpenAI is not being consistent with. This is also in agents SDK. This is our these are our instructions. Okay. instructions, developer or system message as you prefer. You can choose any of them apparently. So these instructions are where we instruct our agent on what it should do, how it should behave, what it can or cannot do, right? All of that information is in here. Then we have our user message. Again, as I mentioned, this is this could be a written message. It could be a event trigger. It could be anything. Okay, so it's good to think about these as events. So in this case, this is a typical chatbot scenario. I am a user. I'm going in. I'm saying, can you help me learn about OpenAI's agents SDK? I'm using Python and I would like to understand what library does. Okay, that's my question. Then what's going to happen is this is going to go to LM. So these instructions here, our developer prompt followed by our user prompt. This is going to go to our LM. Our LLM is going to generate this here, the function call. And look, it could use a function call or it could just go straight ahead and jump into an assistant message. It depends on your how you've prompted things, the tools that you've got and everything set up here. In this scenario, we're going to assume that we have a rag tool. Okay, we're going to assume that this rag tool will allow our assistant to go and retrieve information particularly about the open AI agents SDK, right? Or it could be something else. It could be I don't know like a big encyclopedia of various AI libraries or AI everything, right? It's a essentially a kind of custom search engine, okay, that you've set up yourself. So you put whatever in there really and we'll be talking more about rag later in the course as well. So we have that function call. This is just to be very clear this is generated by the LM. The assistant message is also It's just the structure is slightly different. So in some sense I like to think of function calls as being also an assistant message in some way but it is generally not referred to as a assistant message even though is generated in the same way just in a slightly different format. So function call is always going to be a dictionary or JSON object and it's going to include tool name the tool call ID. This is unique and very important. And then it's also going to have the tool args. These are the inputs to a particular tool. Okay, a tool is just a function. Okay, so like a Python function. So we would have you know we would have something which is called the this is supposed to be defaf um defining a function here which would be rag tool and one of the parameters of that function would be the query parameter okay and our lm will generate a search query here okay so this is not exactly what the user has written it is a generated search query that our function which is the rag tool function is going to use to search for some relevant information. Now in between this function call here and the next function call output in here we are calling like we on our side we call our rag tool. It goes does stuff it goes and does some stuff gets our information and returns it to us. The information it's returning to us is what we get in this function call output message or event. And you can see here I've in this example I've shortened it down because we obviously don't have much space. I've said there's two bits of information being returned. Okay, one of them is from this doc abc. Another one is this doc ghi. Okay, so it's coming from two documents. And the first one of those is very introductory information about OpenAI agents SDK being ideal for developing agentic apps. And the other one is specifically focusing on voice which is a built-in feature of the SDK right and then there would of course be more information over here but we don't have much space so shortening that down. So we've got some information from our rag tool. Then that is going to be passed back to our LM alongside all of these other events up here to generate that final assistant message. Okay. And then that is what we would return to the user. So that is again returning to that chat uh component. So all of these like pink components here, they're internal. Okay, they are either events that are happening like these here or some setup that we've already done before even running anything. These white blocks are what in a typical chat interface the user would see. Now, there could be multiple other things going on here. We might have multiple function calls, multiple function call outputs. We might even have assistant messages that are being created but are internal where the assistant is kind of reasoning and talking to itself and thinking maybe I should do this but uh maybe I shouldn't because of this other thing over here right that sort of thing can be happening it depends on what sort of workflow you've set up in this scenario it's pretty simple so that you know it would look something like this when we're just retrieving information from a rag tool and returning that to the user. So we have those five message types and I'll just clarify that the user developer and assistant messages are actually all of the same message type which is type message and these this single message type is actually differentiated as having various roles. So the role is either developer, user or assistant. Now most of this detail we don't even really need to know to use agents SDK. It abstracts away quite a lot from us. So we initialize our agent with the instructions parameter. We send messages with the input parameter but beyond that we don't really need to use directly any of these other items. But it's of course important to know this if you're developing and wanting to get good at developing AI systems and especially if you're thinking okay I'm going to use Agent SDK but maybe in the future I might use another framework or no framework at all. It's important to know this sort of thing and even just to understand how your system works and it's also worth noting that for example if you have a an application that you're building where maybe users are coming back to your application after you know some time away from it and they're wanting to continue a conversation. It's just one scenario. In that scenario, you would need to load all of your previous interactions and of course format them in the correct way for the agents SDK. Right? So in that scenario, you would actually have to understand, okay, this is a assistant message or a message with ro assistant. This is a function call. call output and so on, right? You need to understand what those are so that you can use them later or if you need to manipulate those in any way like manipulate in a good way like you need to add some intermediate system message for example. Now let's

How to Use SDK Message Types

take a look at how we actually use each of these measures in agents SDK. So we already saw this one creating a user message. It is just when we're running our job, we have our input. This is a user message. So, we run that. Really simple. Nothing to really teach you there. If you'd like to use the types, you would want to use something like this. So, this is actually coming directly from the OpenAI library. And we are just importing the message object there. And this is just okay if you want to be a bit stricter about the typing which I generally recommend if you're building anything serious you probably should be. So in this case we'd create our message here and then we passing in our message within the list here. So I mean that seems more complicated but then you know again if you're building this broader application and you're just kind of passing a string around containing a user message rather than defining hey look this is a user message you know you're you're kind of asking for trouble. So I would recommend you just type everything as much as you possibly can. Cool. So we have that. Now other thing we can oh yes we can simplify this as well. So our user message can also just be a dictionary if we want and you'll just pass that in again like this. Okay does the same thing but again I would recommend going with types. Okay

Developer Messages

now developer messages again this used to be system messages. So that is defined with our instructions as we saw before. We can also define that directly using this with the system messages. In many cases, you probably wouldn't be passing those around as much as you would be a user message, but yeah, you might also want to type those if you can as well. Okay. And then we have developer or system messages that might actually be inserted, you know, not just as the first like initial system message. So in this case for example we might want to for whatever reason right I I'll give you a better example in a moment but for whatever reason in this case where we have instructed our LM to speak like a pirate and we're saying write me a haiku for whatever reason maybe we've you know something gets triggered and it looks like the user actually does not want the system to speak like pirate and instead they should use obvious British slang. In that case, we might insert a system message here. Let me give you a better example though. Okay. So, in this example, right, going back to our earlier visual here, in this example, what we might find is that we would like to ensure that our assistant is always quoting where information is coming from. And in this case, what we can do is we can say, okay, whenever you use the rag tool, we're gonna have some logic in our system, which is then going to go into here, right? So, we're here and we're going to insert a developer message which reminds our assistant to always use citations to always use a particular format and to never say anything that it doesn't know from the context provided. Okay. And that can be a really very strong way of ensuring certain behaviors at certain points in our agents. So that is another scenario, more realistic scenario where you might want to be inserting developer or system messages at various points within a conversation.

Assistant Messages

Okay, so we have that. Let's move on to assistant messages. So again, these are typically our direct responses to the user. The content field in this scenario, the content field of a message is going to be generated by the LM. And it might look something like this. Okay. So, we have our assistant ro here. Then the content is going to look like this. Okay. And we can actually add that to our what is becoming our chat history now. Like so. So, we would have our original user message. We have that developer telling us to ignore the instructions and do something else. Then we have our assistant message here. Okay. So, this is the assistant role. And I'm just going to say, okay, this is my user message. Now, I'm going to say, can you repeat what you just said? Okay, let's just ensure that does in fact work. And it looks like it does. Now, the output from the result there. Okay, let me even show you. So, our output is not quite the format that we need for feeding back into our next input. So what we can do is actually we can take this these results here and we can use two input lists to convert them into the format that we would need. Okay. And you see this has been modified a little bit. It creates this list of dictionaries which looks a bit more like what I was showing you before. Uh and okay one additional thing that we get here is we get all of these additional optional fields that are returned from OpenAI that we didn't populate. We could populate these if we wanted to, but we, you know, we created these on our side, so there's no ID that open air has created. The content field, you can see, is also far more complicated. We have these annotations. We have the type log props here. We don't really need all these. The only one that we truly need is this text here. We have the status here. So, it's obviously completed. And this is of course of type message, which I mentioned before. Okay. So now let's

Chat History

take a look at how the agent or runner maintains conversation history. So first I just want to point out that when we do two input lists here we'll only see that most recent message. So it doesn't seem like this is maintaining the history and we can just confirm. Okay. So could you give me another? We're just going to say what were we talking about? Okay. And we see like okay now it's clearer we be starting fresh on this here voyage. There'll be no previous parlay to recall. Okay. So we can see that whenever we want to add the chat history in like this we need to actually pass it in right. We need to explicitly pass it in. It's not being maintained by our agent. the runner. Okay. And it's not necessarily clear immediately whether that is the case or not. But you know that is the case. So worth just making you know remembering that because you will of course need to implement things differently based on that one small little thing. Okay. So we can move on to

Function Calls

the function call messages. So those look like this. So we have call ID, the tool or function name and the arguments. Okay, so let's say we want to construct a function call where the function or tool will be get current weather is called and the single input parameter here will be the location of London. Okay, and this is what it would look like. So we have the type which is function call. We have the call ID. Okay, so this is unique. This is important that we have this. Okay, without the call, without the tool call ID both here and in the next message, OpenAI cannot pass our history, our chats, our interactions. Okay, so it's important that we have this. Then we have the name. So this is a function that we're going to call. And then we have the arguments for that function. So I think this is just a keyword arguments. So there will be a parameter in our function which is location. and we will provide the word London to that. So let me come down here. We're going to add a function call in here to our history. And well, let me show you. Okay, we got this error, right? And this is why I mentioned before that tool call ID is very important because with every function call or tool call, you need to have the response for that function or tool call in your chat history. So you actually need two messages here. There must always be a pair of messages when we see that call ID field. So this here what we've just done and you can see here this is not valid. Okay. Developer user tool call that's not a valid set of interactions. Whereas developer user tool call tool output that is valid. Right? Then just to be very clear here. Right here we can see that the call ID for our tool call and tool output is the same. Here it is not for the tool call the call ID is call 23. For the tool output 456. This is invalid. This openi won't process this. Okay. Your tool call and tool output need you need to have pairs of those messages and they need to have the same ID. So that's important to know. We'll cover more on that in the chapter on tools. So we need a function call output. We're going to say this is a type. It's a function call output. The call ID here is call 1 2 3. And the output here is of course it's London. It is raining. Okay. Now we can try again. Let's see if it works this time. Now that we have our pair, we have our function call output. Okay. Right. And we can see this is correct. So the obviously we didn't actually use a tool here. I'm sure it probably is raining but in any case we didn't use an actual tool here. So it's telling us okay London today is raining. So that

Conclusion for Agents SDK Prompting

is it for this first chapter on prompting and I think a lot more in agents SDK. We've covered a lot of things of course just how prompting works in agents SDK. The many different ways that you can call a system prompt. Now you can call it instructions or develop a message as well. If you want user messages as well are just as confusing or now user message or input. But then input can also mean a many input messages. You can use any of those as you prefer of course. But what we actually get from all of this is just in reality there's a lot of structure that is behind all these various prompt types, these different message types, the way that we do function calls, function call outputs, all of that. And it's important that we as engineers using the agents SDK are fully aware of this and fully aware of in the back end how all of these things are structured, right? So that we can ao avoid issues like what we just saw at the end there where we don't have a pair of function call and function call output. And it's also very important for us to be aware of how, you know, we can structure this chat history and maybe think about it as less of a chat history and more as a stream of events. So that is it for this chapter. We'll move on to the next one. Thanks.

Другие видео автора — James Briggs

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Дайджест Экстрактов

Лучшие методички за неделю — каждый понедельник