# Build Agentic Ecommerce with KumoRFM

## Metadata

- **Channel:** James Briggs
- **YouTube:** https://www.youtube.com/watch?v=MFp9vjr6rgA
- **Date:** 17.09.2025
- **Duration:** 49:34
- **Views:** 2,266
- **Source:** https://ekstraktznaniy.ru/video/20592

## Description

In this video we explore the use of agents and LLMs for ecommerce, and develop our own agent enabling advanced data science and analytics for ecommerce.

We use Kumo AI's Relational Foundation Model (RFM) to produce insanely high quality predictions super fast, enabling a conversational experience with what is essentially an expert data science agent.

📌 Article: https://www.aurelio.ai/learn/ecommerce-agent
📌 Notebook Code: https://github.com/aurelio-labs/cookbook/tree/main/gen-ai/agents/ecommerce-agent
📍 AI App Repo: https://github.com/jamescalam/ecommerce-agent

💡 KumoRFM: https://bit.ly/47x3WSk
📊 Kumo AI: https://bit.ly/4gduL04

Twitter: https://twitter.com/jamescalam
LinkedIn: https://www.linkedin.com/in/jamescalam/

#datascience #machinelearning #python 

00:00 Ecommerce and AI
01:46 Agents for Ecommerce
02:34 KumoRFM
03:24 Talking with a KumoRFM Agent
10:58 Using KumoRFM
16:51 Making Predictions with KumoRFM
19:02 Agentic Predictions
22:34 Query Dataframes Tool
25:23 Query KumoRFM Tool
26:50 Building the Agent Graph
36:12 Testing our Ecommerce Agent
46:14 Ecommerce Agent App Setup
48:00 Ecommerce and Agents

## Transcript

### Ecommerce and AI [0:00]

Today we're going to be talking about two different fields and the intersection between them: e-commerce and modern agentic technology. Now, e-commerce is not new to technical innovation. In the field of machine learning, and later AI, e-commerce has always been a key player, and it has always pushed the field forward. I think there are probably two big reasons for that; maybe many more, but two big ones. First, e-commerce as an industry just has bucketloads of data, and second, there is the possibility of very big financial rewards for people who do well in that field. So you have tons of data and tons of money. Put both of those together and you get very quick advancement of data science applied to the real world. Now, what's interesting is that the latest wave of AI that everyone is using is used a little bit in e-commerce, but it's used in the typical way, right? Put a chatbot on your site, get it to talk to people, maybe put in a little bit of RAG so it has access to your FAQs or a bit of your product database. But the things that made e-commerce such a good place for machine learning and data science in the past, i.e. lots of data, most of it structured, are exactly what make it not so useful for the latest wave of GenAI. Because LLMs are not that great at parsing through really large amounts of data, and they're not that great at parsing through structured data. So what we're

### Agents for Ecommerce [1:46]

looking at here is that LLMs are not that great with these huge amounts of e-commerce data. But interestingly, if we take those more traditional algorithms that have been used in e-commerce for a while (not super traditional, but more traditional, more proven algorithms for e-commerce and predictive querying as a whole) and use them as tools for LLMs or agents, then things start to get pretty interesting. With agents, our LLM will not need to interact directly with that huge amount of data. Instead, we put those algorithms in as tools for our agent to use. And this is where

### KumoRFM [2:34]

Kumo's Relational Foundation Model, or KumoRFM, comes in. KumoRFM is somewhat of a merger between the more generative LLM technology and something called graph neural networks. Graph neural networks are really good at mapping relationships in data and at understanding very large amounts of highly interconnected data. What we're going to do is take a quick look at KumoRFM itself and see what we can do with it, then look at how we can integrate KumoRFM with a fully functional agent that will allow us to ask all of these questions of a really big e-commerce dataset and really dig into it. It's super interesting. Actually, let me show you exactly what I mean. So, this right

### Talking with a KumoRFM Agent [3:24]

here is what I'm going to be showing you how to build. This is a fully functional chat application. We have an agent in the backend that we have built, which has access to KumoRFM as one of its tools, and the stuff it can do is honestly incredible. So let me first say hello, I want to be polite. Straight away: okay, how can I assist you with the H&M e-commerce dataset, which is what we are using. I'm going to say: can you give me some of our most valuable customers? Now, this query does not need Kumo. Kumo is for more predictive things; it's predicting the future. This is just looking at our data: who are our best customers? We actually have a couple of tools in our agent here. One of those is this query dataframes tool, which queries pandas DataFrames. So it writes some code, you can kind of see it in here, it's not super cleanly organized, and it's going to get out some customer data. And that's what we have here. So we can see, okay, the customer at the top here, their total spending within our sample dataset is this much.

So we're going to say, okay, customer zero looks interesting, and we're going to approach this from an analytics/marketing perspective. What we want to do, looking at our customer data here, is put together an email marketing campaign. It's going to be very targeted, very personalized, because we can do that with this. Obviously we're chatting here, but this sort of thing can be put into a script and run over all of our customers, right? So we're going to continue and say: customer zero looks interesting. This is where Kumo comes in: what is the probability that they will purchase something from us in the next 30 days? Okay, we've got KumoRFM being used here. You will sometimes see it being used a couple of times, because the LLM is writing its own queries, so it sometimes gets it wrong before it gets it right. But we've given it guidelines that allow it to figure out pretty quickly what the right way to write a query is. So here we have: the probability that the customer will make a purchase in the next 30 days is approximately 70.2%. Pretty high, but I'm going to want more information if I'm putting together this marketing campaign. So I'm going to say: okay, great, that's amazing, but what are they going to purchase? What do we think they are most likely to purchase?

And I just want to be very clear: this information here is not in the dataset. This is being predicted by Kumo on the fly as we're making these queries, which is incredible. This is the sort of thing that, as a data scientist, you'd be working on for a while, trying to get all this together, and now I don't even need to know what's going on. I'm just like: okay, what? Tell me more. I'm just exploring, which is incredible. So, what do we think they're most likely to purchase? This query here, we can just show you: this one did fail. It tells the agent why it failed, and then the agent fixes the query. So this is what we're looking at: it's predicting the distinct products, or product IDs, and looking at just the top one for that particular customer. So we're getting the top product, which is this "loose straight HW cons", whatever that is. Okay, we have the details here.

Those are five-pocket jeans in washed denim with a high waist. Okay, that's pretty cool, but maybe I want more there, right? So I'm going to say: maybe we can see a few more of their most likely purchases. Let's go with five. We'll use these to write our email in a moment. Okay, that's pretty cool. We have a dress, mostly jeans from the looks of it, and then another dress as well. That's nice.

So, okay, let's tell the agent what we're actually trying to do here: we're working on a highly personalized marketing campaign and I need to write an email to this customer. We want to offer a 10% discount on their most probable purchases in the next month. Let's write the email. We'll write this and probably go through a few iterations to make it look the way we want. And yeah, just off the bat this is already super personalized: "we notice you like styles such as our loose straight HW cons trousers and other denim favorites". This is where the LLM side comes in, synthesizing all this information that we have and putting it together in this really nice email.

But I'm going to say: I'd like to list out their top five most probable purchases, and I want to include images in the email. And I'm going to just make this up, this doesn't actually work: we do this by writing, and I'll just put in some random syntax here, "article" and then the article ID. Okay, so we'll do that. Yeah, let's go with that. So now we can basically confirm that it is getting this very specific information. And that's cool, right? Think how quickly we've put together this email. Again, this is just one customer, but of course, we could very easily put this together for multiple other customers and just run it automatically. Maybe we do some spot checks here and there, but for the most part, this can be automated like crazy. The predictive ability of KumoRFM, combined with the ability of an agent to build this sort of thing, have a conversation with it, and build these emails, is insane. So that is the demo,

### Using KumoRFM [10:58]

but I want to show you hands-on exactly how we build all of this. So we're going to jump across to the Aurelio cookbook repo, go through gen-ai, agents, ecommerce-agent, and into the notebook. We're going to run this in Colab for now. You can also run it locally; in that repo you probably saw there is a uv setup, so you can set up the environment with uv. For Colab, of course, we're not doing that; we just run this cell here, which installs everything we need. You might get some warnings about ipykernel; that's fine.

The first thing we need to do is set up Kumo. Just to be very clear, there are almost two Kumos: there is the core enterprise Kumo product, and then there's this more lightweight but really fast KumoRFM, which is a foundation model that they've built. We're using the foundation model, which is slightly different and, to be honest, much easier to get started with. It's really good. First we authenticate ourselves: click down here, generate an API key, you'll get a little popup, and you will probably need to create an account there. Once you have that, we go ahead and initialize our Kumo client, and you should see this.

Then we can jump down here and pull in our dataset. We have a sample of the H&M e-commerce dataset. Oh, you may need to just run this quickly. Interesting, thanks Google. I think I can just delete that now. You can use the full e-commerce dataset, but it is really big, so download times are pretty crazy. You can use it; in fact, I built most of this with the full dataset before I realized it's probably a lot better to just use a sample. But you can do either. The sample we have here is the more interesting part of the dataset: a lot of active customers, customers making a lot of transactions, or, on the other side, customers that have just kind of fallen away. So this is, in my opinion, a more interesting sample of the full dataset, which is good for playing around with.

What I'm doing here is downloading that dataset from Hugging Face, which you'll find on my profile. This is the H&M sample dataset. On the customer side it's 1,000 customer rows, and then I thought there were maybe 100,000 transactions paired with those customers; actually, sorry, transactions is 15,000, and then we have articles, which is products, at 5,000. So there's still a fair bit in there, and you can see all the information from our dataset. This is where Google wanted me to enable custom widgets. Then we can go ahead and convert our dataset object into pandas DataFrames, and the reason we do that is for integration with Kumo. And actually, I just messed that up, I jumped ahead too much; let me rerun these and run this. Okay, so I've got my pandas DataFrames here, and you could do normal pandas stuff on those. If you wanted, you could do `customers_df` and see the head of your data.

But the main reason we do that is because we can use RFM's LocalTable abstraction, which, as I understand it, is a thin wrapper around pandas DataFrames that allows us to pull out particular information from the DataFrame and organize it in a way that's more friendly for KumoRFM. So we do that, and we use this infer_metadata method to automatically infer the type of data we have within each table, basically setting up a schema that KumoRFM is going to be reading. You may also need or want to manually modify some of these. KumoRFM uses these semantic types: it's looking at, okay, customer ID, what is its semantic type? It's an ID. Age? That's numerical, and so on. Primary keys are important as well, and time columns if you want to do any temporal queries. So we set all of those.

Then we create our RFM graph. KumoRFM is a graph neural network, and what we're setting up here is telling KumoRFM: hey, these different tables that we've already defined are connected with these foreign keys. We're saying transactions is connected to customers via the customer ID column, or key, and likewise transactions to articles via the article ID column. And then with that, we're set up on the KumoRFM side.
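Condensing the setup steps above into one place, here is a minimal sketch, assuming the kumoai SDK's `LocalTable`/`LocalGraph`/`KumoRFM` abstractions as described in the video; treat the exact method signatures as assumptions and defer to the cookbook notebook:

```python
# Minimal KumoRFM setup sketch; signatures are assumptions based on the video.
import kumoai.experimental.rfm as rfm

rfm.init()  # prompts for the API key generated from the KumoRFM dashboard

# Wrap each pandas DataFrame in a LocalTable, then infer semantic types,
# primary keys, and time columns automatically.
customers = rfm.LocalTable(customers_df, name="customers")
customers.infer_metadata()
articles = rfm.LocalTable(articles_df, name="articles")
articles.infer_metadata()
transactions = rfm.LocalTable(transactions_df, name="transactions")
transactions.infer_metadata()

# Build the relational graph: transactions link to customers and to articles
# via their foreign keys, exactly as described above.
graph = rfm.LocalGraph(tables=[customers, articles, transactions])
graph.link(src_table="transactions", fkey="customer_id", dst_table="customers")
graph.link(src_table="transactions", fkey="article_id", dst_table="articles")

# Initialize the foundation model over the graph.
model = rfm.KumoRFM(graph)
```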

### Making Predictions with KumoRFM [16:51]

So we initialize the model there, and then we can go ahead and actually start making some predictions. We use a slightly modified version of the Kumo PQL syntax. It's very similar, but there are just a few things to be wary of. Basically, Kumo, both the enterprise product and KumoRFM, have what they call a predictive query language, or PQL, and that's what we're using here. We're predicting the sum of the price of transactions over the next 30 days, from now, for a particular article, a particular product, whose product ID is here. Let's see what this prediction gives us. It will take a moment as Kumo makes a prediction; although, that was insanely fast. It's looking at a ton of data and then making our prediction. So, what is our demand for this particular product? Incredibly low. We're basically not expecting to sell anything here. That's fine; that is still useful information.

Let's try another one. Now we're going to look at a couple of customers, so we get a couple of customer IDs, these two here, and I'm going to say: predict the number of transactions over the next 90 days for these customers, customer one and customer two. So we'll run that. What we're doing here is a classification. The classification is: what is the likelihood that the count of their transactions over the next 90 days is equal to zero? So we're basically asking how likely these customers are to not buy anything over the next 90 days; this is predicting something more like churn. So let's have a look. We have the predictions here, and the predicted class is actually false. So it's false that they will not buy anything, meaning they will most likely buy something over the next 90 days. And that's all really nice.
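For reference, the two predictive queries from this section look roughly like this in PQL. The query strings follow the video's description and Kumo's PQL conventions, and the IDs are placeholders:

```python
# Regression: predicted revenue for one article over the next 30 days.
demand = model.predict(
    "PREDICT SUM(transactions.price, 0, 30, days) "
    "FOR articles.article_id = 108775015"  # placeholder article ID
)

# Classification: probability that a customer makes zero transactions over
# the next 90 days, i.e. a churn-style prediction.
churn = model.predict(
    "PREDICT COUNT(transactions.*, 0, 90, days) = 0 "
    "FOR customers.customer_id IN ('c-1', 'c-2')"  # placeholder customer IDs
)
print(churn)  # predicted class (True/False) plus class probabilities
```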

### Agentic Predictions [19:02]

But if I'm working in analytics or marketing and I want to be really fast here, I probably don't want to be writing all these queries all the time, trying to figure everything out. Instead, I can make this really easy for myself, which is what we did in that chat demo earlier. The way we do that is we create an agent and give it the ability to query Kumo as a tool, and then the agent figures out how to write the queries. In fact, we give it a guide so it knows how to write queries better than I can, for sure.

So we're going to go ahead and do that. The first thing we'll need is an OpenAI API key, because we're using OpenAI for the agent part. I'll run this and enter my API key. Here we go. I'm going to just confirm that this is working. We're using GPT-4.1 mini here. GPT-5 is out, it has been for a while, but I find those models to be just slow, and I don't see much benefit to them right now, so I'm generally sticking with 4.1 for most things. I just said: tell me something interesting about graph neural networks, which is obviously what Kumo is using. And yeah, that's working; we're streaming all of that information out. So OpenAI is working.

Now we want to set up our agent. To do that, we're going to use a "no-framework AI framework" that I built, called graphai. It's intended to be a no-framework AI framework because there are a lot of frameworks out there that are just very bloated. The idea behind graphai is that it just provides you with graph execution logic, and everything else you build on top. All the API calls and everything else, we do ourselves, but what that means is that we can actually write pretty simple code most of the time, by having less abstraction.

So the first thing we need to do is define our tools. We're going to have two tools. In graphai, of course, we're building agents as graphs, and I just want to show you what the graph will look like. It's actually very simple. We have the start node over here, which goes to our LLM, and our LLM has access to two tools. There is the query dataframes tool, which can query the three pandas DataFrames we have already defined: the transactions, customers, and articles DataFrames. The LLM can also query Kumo, so basically what we just did with the predictions. The LLM can write queries to both of these: it can look at our data and pull out interesting things, like a sample of customers or which articles a particular customer has purchased, and it can go to Kumo and ask, okay, what are the most likely next purchases for this particular customer, or what are our predicted best-performing products over the next 90 days? The LLM will be able to use both of those tools, and then we just have an end node for when we respond to the user, and all of this will be streamed. So, very simple.
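The quick OpenAI sanity check mentioned earlier in this section is just a streaming chat completion; a minimal version looks like this (standard OpenAI Python SDK, model name as mentioned in the video):

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Stream a one-off completion to confirm the key and model work.
stream = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{
        "role": "user",
        "content": "Tell me something interesting about graph neural networks.",
    }],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```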

### Query Dataframes Tool [22:34]

Now let's see how we actually define all of this. First, the tools. We have the query dataframes tool, with some context in here: this is the tool description that we provide to the LLM. We also set up the tool schema here; that's a function schema that defines the tool for the LLM. But then we also need the function that will be called when we use this tool, and that's what we have here. Again, it's not particularly complicated. We set up a namespace that we're going to query against; that namespace includes our DataFrames and also pandas, already imported, so the LLM can use pandas methods and functions directly. We get the query from our LLM; the query is some Python code the LLM has written that needs to be executed against our DataFrames. So that's what we're doing here: clean it up a little and then execute it within that namespace. Then we pull out the data that was stored in the `out` variable, and we use that as the response back to our LLM. So this is a tool output, and we also stream it out as well, if we want to use it or see it, like we did in the chat. In the chat, if we come up here, we can see that we had this input query and then we also had the output; we're streaming all of this as we go.

Then we want to be able to query KumoRFM. For that, we first need to provide some details to our LLM, because KumoRFM is very new, so most, maybe all, current LLMs probably don't know the exact syntax for KumoRFM. They might be aware of Kumo syntax, but it does vary a little. We can find the exact prompt we're using, the guideline, by going to that link. This is a reference with everything our agent needs to know about writing KumoRFM queries. It's relatively concise, but there is of course a lot to cover there. It works really well, and I would say it's absolutely necessary for this tool to work. So we do use that, but because it's quite big, we don't put it into the tool description; we put it into the system prompt for the agent. We'll come back to using it a little bit later.
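Here's a simplified, synchronous sketch of the query-dataframes tool described at the top of this section. The real version is async, streams its output, and appends to the event history, but the core exec-in-a-namespace idea is this (variable and function names are assumptions):

```python
import pandas as pd

def query_dataframes(query: str) -> str:
    """Execute LLM-written pandas code and return whatever it stores in `out`."""
    namespace = {
        "pd": pd,  # pandas pre-imported for the LLM
        "transactions_df": transactions_df,  # DataFrames defined earlier
        "customers_df": customers_df,
        "articles_df": articles_df,
    }
    # Strip any markdown fences the LLM may have wrapped around the code.
    code = query.strip().removeprefix("```python").removesuffix("```").strip()
    try:
        exec(code, namespace)  # LLM-generated code; sandbox this in production
    except Exception as e:
        return f"Error executing query: {e}"
    return str(namespace.get("out", "No `out` variable was set by the query."))
```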

### Query KumoRFM Tool [25:23]

So now we define our second tool, the KumoRFM tool. It's a little simpler in its description, because most of that context is going to be passed into the system prompt. The tool itself is also pretty simple, because we are just getting our PQL query from our GPT-4.1 LLM, accessing the KumoRFM client, and running predict against it with the query. We return any error so that the agent can correct itself, but if there is no error, we of course return the result as well up here; it's formatted into an output and appended to our events, where the events are essentially the chat history for our LLM.

So that's great, we have all of that. The final thing we need to do, so that our LLM can read these tools we've defined, is to convert our pydantic BaseModel schemas into a function schema that is OpenAI-compatible. That's what we're doing here: we're using this function schema helper with pydantic to create a function schema object. If I run that, you see the function schema that we create. Pretty simple.
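The KumoRFM tool itself reduces to a thin, error-tolerant wrapper around `model.predict`; a hedged sketch, assuming `predict` returns a DataFrame as in the earlier section:

```python
def query_kumo_rfm(query: str) -> str:
    """Run an LLM-written PQL query against KumoRFM."""
    try:
        result = model.predict(query)  # `model` is the rfm.KumoRFM from before
    except Exception as e:
        # Returning the error text lets the agent rewrite the PQL and retry.
        return f"KumoRFM error: {e}"
    return result.to_string()
```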

### Building the Agent Graph [26:50]

Now, building the graph. As I showed you before, we have that very simple graph: start, LLM, two tools, end. That's all there is to it. We need to define the nodes of that graph; we've already defined two of them, our two tools, so now we just need to define start, LLM, and end.

Let's do the LLM first; that's a little more complicated. We define our function here. It's a router within graphai, which means it can go in multiple directions, which of course the LLM node can. And we're writing the OpenAI API logic directly; we're not using abstractions here. This is the chat completions call we're creating. We pass in our chat history/events and the tools, which we defined up here in the correct format for OpenAI, and we're going to be streaming the output. Another thing, just to keep things a little more maintainable and controllable, is that we stop OpenAI from making parallel tool calls, because we'd need extra logic to handle those. We could do that, but to keep the code simple for now, we're not doing it here.

So we have that. Once we've done this, our tokens are going to be streaming out from OpenAI, so all we're doing here is essentially catching all those tokens as they stream out and, depending on what we're doing, possibly returning them to the user. We handle each of the different token scenarios differently: streaming the direct answer to the user; streaming the "this is a tool call" event, which says "I'm going to use this tool" and also includes the tool call ID, which is important to maintain; and the remaining scenario, "now I'm generating tool arguments", which we also handle here.

We have all of those, and then we come to the end of this call. This is a single iteration of the LLM: the LLM is going to say either "I'm going to use the query dataframes tool", "I'm going to use KumoRFM", or "I'm going to answer directly back to the user and go to the end node". So what we're doing there is saying: okay, if we have a direct answer, that means we're answering the user, and that means we need to go to the end node. This choice variable just controls the logic of where we're going. We also append the assistant output, this assistant message, to our events; we do that for tool calls as well, so we have all of that in the chat history, or event history. Otherwise, if we see that we have a tool call from this run of the LLM, we handle that and actually trigger the call to the chosen tool. And that is it; that's our LLM node. Basically, it's generating text and we're using what it generates to decide which place in the graph we go to next. Then we have the start and end nodes, and they're really simple, just boilerplate. So we have those.

Now, one thing you might notice we haven't done yet is set up our system message for our LLM. That's because the system message is going to be stored within the events; it's just the first message in the chat history/event history. So we do need to create that. Let's do it. It's relatively simple; we're just giving it a little bit of context.
"You are a helpful assistant using various tools and KumoRFM to answer the user's analytical queries about the H&M e-commerce dataset." Here I just want to be very clear about how the LLM should use the tools available to it. I want the LLM to use everything it has to answer all the questions, and not have me prompting it to go again and again; I want it to get everything it needs and then come back to me. That's what I'm trying to do here.

Now, that being said, I don't want it to just keep going and going either. One thing I would be concerned about: maybe I ask a question like "get me all of the customers' favorite products from the database", and what might happen is that the LLM goes and queries for every single customer one by one, which would be an insane number of queries. It would take an insanely long time and cost a lot of money. I don't want that to happen, so I'm making it clear that I've set a limit of 30 steps for each interaction turn. Generally the agent is not going to go above just a few steps, but just in case, I have told it there's a 30-step limit and to try not to go beyond it. You could probably expand on this bit a little to make it clearer, but this should work. Then finally, since we're using KumoRFM here, we need the reference guide, and that's what we insert down here: the PQL reference that we downloaded before, this big thing here. And that is our system, or developer, message, as OpenAI now calls it.

So we have that, and the way we set this up is we create a state for our graph agent, and I'm putting the system message into the event history as the first message. Later on, as we go through, as you saw before, each assistant response, each tool call, and each tool output gets appended to the events. So that's our chat history, but of course it's not just chat history, it's everything; that's why I call it events. In our state we also include access to KumoRFM, so our tools can call it, and access to the DataFrames, so our tools can call those as well. And of course OpenAI is in there too. Okay, let me run this. There we go.

So we have that, and now it's time to set up our graph. This might look like a lot, but it's fairly simple. I'm setting up my initial graph rules, essentially: I don't want any more than 30 steps through the graph, otherwise stop. We set our state here, which is just this. If you want to reinitialize or refresh a conversation, you would reinitialize your state and set it here, or just reinitialize this state object, and this initial state will point to it. That's how you refresh a conversation. Then I need to add nodes to the graph, so I'm adding them here: again, just five, if you remember: start, to the LLM, which goes to the two tool nodes, and then the LLM finally goes to the end node. Then we have our special routing mechanism: the LLM goes to either one of the two tools or to the end, which is what we're saying here. From the source, we go into our LLM, and the LLM makes a choice as to which destination to go to. So that's our routing mechanism.
If we had a larger graph, this routing mechanism wouldn't need to include all the nodes; it just happens that with this smaller graph it does. It only needs to include where you're coming from, your router node, and where that router node can go out to. Then finally we have a couple of edges to create. When the KumoRFM tool gets triggered, it needs to go back to our LLM, so we define that here. We also want the dataframes tool to go back to our LLM, and we define that here too. Then finally our LLM will go to our end. Actually, wait, why do we need that? Let me remove it. And then we just compile the graph, which checks that everything is valid. I commented that edge out because we already have LLM-to-end defined via the router, so we don't need to add another one there.
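Putting the last two paragraphs together, here is a hedged reconstruction of the system message and graph assembly. The graphai method names follow the video's narration (add nodes, a router from the LLM to the tools and end node, edges from the tools back to the LLM, then compile); the node names and exact signatures are assumptions, so check the notebook:

```python
from graphai import Graph  # assumed import path for the graphai framework

# System/developer message, paraphrased from the video; `pql_reference` is
# the KumoRFM PQL guide downloaded earlier.
system_message = (
    "You are a helpful assistant using various tools and KumoRFM to answer "
    "the user's analytical queries about the H&M e-commerce dataset. Use "
    "your tools to gather everything you need before responding. There is a "
    "hard limit of 30 steps per interaction turn, so avoid issuing one query "
    "per customer over large samples.\n\n"
    f"# KumoRFM PQL reference\n{pql_reference}"
)
state = {"events": [{"role": "system", "content": system_message}]}

graph = Graph(max_steps=30, initial_state=state)  # stop after 30 steps

# Five nodes: start -> LLM -> (two tools | end). Node names are assumptions.
for node in (node_start, node_llm, query_dataframes_node, kumo_rfm_node, node_end):
    graph.add_node(node)

# The LLM is a router: it chooses one of the two tools or the end node.
graph.add_router(
    sources=[node_start],
    router=node_llm,
    destinations=[query_dataframes_node, kumo_rfm_node, node_end],
)

# Each tool hands control back to the LLM for the next step; the LLM-to-end
# edge already exists via the router, so no extra edge is needed there.
graph.add_edge(query_dataframes_node, node_llm)
graph.add_edge(kumo_rfm_node, node_llm)

graph.compile()  # validates the graph
```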

### Testing our Ecommerce Agent [36:12]

Now we can use our agent; everything is set up. At this point, what I showed you in that chat demo earlier, we have the same here, at least in structure, on the backend behind the API. The agent is set up and we can start talking to it.

The way I do this is I initialize this event callback; that's what I'll be using to stream. The reason I'm setting this up is that if you're using this behind an API, for example, you don't want to be printing things in your code. You would instead set up a callback, and this callback streams the tokens from inside your API and pulls them out for you. So if you're using FastAPI or whatever else, you would use this event callback to stream those tokens through your API. Then I set up my state, or specifically the events in my state: I update the events value to include what it already has, because I'm appending here, and add a new user message, which is "can you predict demand for article ID over the next 30 days". Then I send that with this graph execute. All of this asyncio.create_task logic is here because we're streaming; you don't need it if you just want a direct output. It does make things look a bit more complicated, but in most cases you probably do want to stream, which is why I'm showing the streaming here; I think it's generally what most people need to be doing.

Okay, so we run that. I kind of missed it there, but we can see the tool that is being used, then we see the query, and we can see that it actually wrote the query wrong here. Our agent wrote the query wrong, then wrote it wrong again. I didn't include the tool output; we could, right? I could print it and we could see the error messages being returned, but it's a lot, so I didn't want to. And then finally it does get it right, and we can see KumoRFM go and do our prediction. And then this here is the output: the predicted demand for this article over the next 30 days is approximately a very small number. So yeah, very low expected demand, or revenue. Cool. And I didn't write any PQL or anything there; the agent did it for me, which is really nice. Super cool.

Now, I don't want to write all of this out every time, so I'm basically just going to take it all and wrap it into this chat function, and let that handle everything for me. It's the same as what I wrote before, just in a function. So I'm going to say: what other useful info can you give me? We're going to go through a similar conversation to the one we had at the start, a little bit different: I'm looking not at our most valuable customers, but at the ones at the highest risk of churning. So we can run this and see what we get. We're querying; my question is "what other useful info can you give me? I'm preparing our monthly marketing email." I'm being very open here, not asking anything specific, so the agent has decided to get me some general insights about the product, and it has gone and done that. Okay, nice: there is a total of two transactions for this product in the past, that was the last transaction date, and so on. That's cool. Now I'm going to say: okay, can you help me find some customers that are likely to churn?
So let's run that and see what we get. We use Kumo, of course, because we're making a prediction, but then we might want to go back to the query dataframes tool to pull out some particular information, and sort of go back and forth. You can see that happening here. Some of these might also be errors in the queries; I'm sure a few of those were, actually. What happened successfully here is we went to query dataframes to get a list of customers who had transactions in the last 90 days, got those out, and then the agent went to KumoRFM and looked at how many transactions those customers are expected to make over the next 30 days. We looked at one customer as a test, and they have a 45.6% probability of churning and a 54.4% probability of staying active. That's pretty useful information, I think, and you could run these sorts of queries over your full dataset and quickly get a huge dataset of who's most likely to churn, who's most likely to stay active, all this sort of stuff, which is insane.

So I'm going to say: can you get me a sample of 50 customers? This is quite a lot. One thing to be aware of here, especially if you're building something like this for an actual product: these really long customer IDs take a long time for an LLM to generate, because they're many tokens. Most of these characters are probably one token each, so it's almost like you're feeding in or generating 30 words just for one ID. So one thing you might want to do is create some mappings to shorten these IDs, so that your LLM can do all of this a bit faster. Anyway, getting the sample here was fairly quick; it's just pulling it out. I'm going to be a little more precise now: we're getting recently active customers. You can see it's writing out the customer IDs, and as I mentioned, these take a really long time for an LLM to generate, so this query is going to take a while. In this sort of scenario, where we're doing some heavy analytical work, that's probably not a terrible thing, because the amount of time it would take you as a data analyst to write would be quite a while. So it's not too bad a use case for it to be a little slower, but yeah, at 40 seconds, it can be made much faster just by creating mappings for those IDs.

All right, so out of the sampled 50 customers, 20 of them have been recently active, with their last transaction occurring within the past 30 days. Cool. So I'm going to say: let's filter down to those most likely to churn who also have past purchase history with us. So we're creating some predictions. This is the churn prediction; you might recognize it from earlier: we're predicting whether the number of transactions over the next 30 days is going to be equal to zero. That errored here, and errored again here, so let's try again. I think we're possibly passing too many customers to it. Let's see. Okay, that one was successful. Once it gets to KumoRFM, it's really fast, like two seconds. But that's kind of the point: LLMs are not good with a lot of data; more traditional methods, even not-so-traditional ones like GNNs, can be much faster.
So, okay, from the recently active customer sample we've identified the customers most likely to churn. That looks interesting: we have a customer with a churn probability as high as 95%, and another that's basically not likely to churn at all. So I'm going to say: can you give me the top three most likely to churn? All right, so yeah, all pretty high churn probabilities there. And "churn" here just means they're not going to purchase anything in the next 30 days; I don't know if I would count that as churning, but that's what we're using here.

So I'm going to do what we did before: let's write a personalized email to the first customer on that list. We want to use what we know about their past purchases and their predicted most likely future purchases to try and offer them something that might pull them back in. All right, so we have: "We miss you. Check out these styles just for you." Then we're looking at, okay, we know what you've purchased before, so we're being very personalized here: sandals from our shoes collection. We appreciate your style, keep looking great. And: you might want to look at some of these other things, which are what we think they're likely to purchase in the future. So we have those; that's cool. We can obviously adjust the language, so I'm going to say: can you try and make it sound more natural? Of course, we're using LLMs, so it probably can. That's good, but I would still like to be specific here, so, similar to what I did before, I'm going to say: add the article IDs in square brackets and an image of the product will appear. Slightly different from what I said before, but it's just an example. So then it's adding those; it's being more specific.

So yeah, you can see there we built a fully-fledged e-commerce analytics agent, and it was super simple. All of the code for that, or at least the example I took you through, is in the cookbook. Otherwise, if you want the application itself, the frontend app, you can get that as well.
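For completeness, here is a hedged sketch of the `chat()` wrapper used throughout this section: append the user's message to the event history, run the graph as a background task, and print tokens from the streaming callback as they arrive. The callback and execute details follow the video's narration and are assumptions; see the notebook for the exact graphai API.

```python
import asyncio

async def chat(message: str):
    # The chat history lives in state["events"]; append the new user turn.
    state["events"].append({"role": "user", "content": message})
    callback = graph.get_callback()  # assumed: token-stream callback object
    task = asyncio.create_task(graph.execute(input=state))
    async for token in callback.aiter():  # assumed: async iterator of tokens
        print(token, end="", flush=True)
    await task

# In a notebook cell (top-level await):
# await chat("Can you help me find some customers that are likely to churn?")
```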

### Ecommerce Agent App Setup [46:14]

So that app is over here, in this ecommerce-agent repo, and it's really simple to set up. If you just want to jump in and start playing with it, you can. You clone it, cd into the ecommerce-agent repo, and then you need to copy the env example file: `cp .env.example .env`. Then you go into .env and fill in your details; there are two values, your Kumo API key and your OpenAI API key. Once you've filled those in, you just run `docker compose up`, and this will build the full application: the frontend, the backend, and there's also some telemetry stuff in there, which I'm not going to go through. Once that has built and deployed, you have your API over at localhost:8000; you can see the Swagger docs here. It's just one endpoint, the chat endpoint, very simple. And then you have your application over at localhost:3000, and you can just talk with it here, ask it a ton of questions about your data. Really easy to set up. Now, I hope it's

### Ecommerce and Agents [48:00]

very clear from this how incredibly useful this sort of thing could be as an integration, and not just for analytical agents. You could also think about doing this for customer-facing agents. If you have a customer-facing agent and the customer says, "Okay, look, I really like this product, but I want something for summer," your agent can go hit Kumo, or even hit the DataFrame first, look at what they've purchased in the past, then hit Kumo to see what they're most likely to purchase in the future, and find products there that are suitable for summer and fit the user's query. So it doesn't have to be an internal tool; it can also be customer-facing. And I'm sure there are a ton of other applications for this as well. Just being able to add this predictive capability to your agents is insane. Yeah, that is it for this video. Let me know what you think, let me know if you found this useful, but I will leave it there for now. Thank you very much for watching, and I will see you again in the next one. Bye.
