Studio Update #06: Fine Tuning to Tag Support Tickets? Plus Dynamic AI prompting via Spreadsheets
31:47


n8n · 17.01.2025 · 5,320 views · 156 likes


Video description
In this episode, the Prompt Engineering for AI Agents tutorial dropped! Then Max kicks off a collab with n8n's Support team, and Jim Le shows off a handy pattern for dynamically injecting prompts. Follow Max on LinkedIn: https://www.linkedin.com/in/maxtkacz/

Here's what's inside:

🤖 Building AI Agents Pt 3 is out: https://youtu.be/77Z07QnLlB8
⚙️ Fine-Tuning Tiny Models – Breaking ground on a Gmail ➡️ 3B self-hostable LLM in your own voice that can write drafts for you.
✂️ n8n Support AI Project – A behind-the-scenes peek at meeting with n8n's Support team to spec out a 'Tag Support Tickets' use case in @zammadhq. Multi-shot prompting or fine-tuning? Let's see!
♻️ Dynamic Prompts with Jim Le – Learn how description metadata in platforms like Baserow lets end users update system prompts on the fly, no developer needed.

Chapters
00:00 - Intro
00:58 - Studio Project Updates
03:24 - Kicking off Tagging Support Tickets Project
13:24 - Dynamic Prompts with Jim Le
31:04 - Wrap up

🔗 Links and Resources:
Sign up at https://n8n.io and get 50% off for 12 months with coupon code MAX50 (apply after your free trial!)
https://community.n8n.io for help, inspiration, and connecting with fellow builders
Connect with Max on LinkedIn: https://www.linkedin.com/in/maxtkacz/

#aiagents

Table of contents (5 segments)

Intro

All right, have a look: welcome to the sixth Studio Update, the show where I update you on AI and automation projects from myself and n8n's global community. I'm Max, the original flowgrammer. We've got an awesome episode today, so stick around for some laughs; you may even learn something, though no promises. In this week's episode I've got an update on my Building AI Agents tutorial series, a quick check-in on my first explorations into fine-tuning tiny LLM models, and a project I kicked off with n8n's support team to help them automate some stuff with AI, so we get a quick peek at that as well. I also had a lovely chat with the venerable Jim Le, who was a guest on our first episode. He walks through a pretty cool pattern that I think is going to be powerful for anyone who is handing off AI solutions to other people and doesn't want to get pinged every time there's a small tweak. Trust me, it's awesome.

Studio Project Updates

All right, as usual, let's kick things off with some studio project updates. Part three of my Building AI Agents tutorial series is finally out. It focuses on prompt engineering specifically for AI agents. It's designed with beginners in mind, but unless you're a prompt engineering expert, I'm sure you'll learn a thing or two, so definitely go check it out and follow the link. Now, there's a lot of discourse right now in the AI influencer circles, with salacious headlines like "you never need to prompt engineer ever again," plus all these AI tools that help you with prompt engineering. Yes, they're helpful; yes, I use them. But I highly recommend that you learn what the hell you're doing, so you can audit the suggestions given by your AI tool. AIs are very confident in their results, they're usually instructed to be helpful, and you could be piping a hot steaming pile of BS into your AI agent without knowing. So educate yourself, watch the video, and do me a favor: after you watch it, share it with someone. Maybe you've got a plumber friend who's dealing with invoice nightmares. So yeah, check that one out and please do give me some feedback. I don't think it's perfect, but it's the longest tutorial I've ever made, and I think I did a pretty good job of balancing depth and actually teaching you some stuff. I agreed with Angel that he's going to update us every two weeks to give him more time to crack on, so no pressure, Angel. [A few moments later] As mentioned last week, the next thing we're picking up is fine-tuning. I'm seeing all these folks talking about fine-tuning tiny little LLMs and having them perform super well on specific tasks. To explore that, I'm trying to fine-tune a 3B model that can write Gmail drafts for me. So far, I've ingested a bunch of emails from myself and built a bit of a pipeline to do that; I've been able to turn them into the format I'd need, and I also have it generating some synthetic examples. Basically, I'm 80-90% done with my ingestion pipeline for Gmail. The next step is to figure out how to fine-tune a model. I'm trying to find some experts on this, because collaborating is definitely going to be more fun. So if you are a fine-tuning expert, you've got a bit of free time on your hands, and you'd like to get some eyeballs on whatever you're doing, hit me up. I'd love to learn, and I'd love to take your expertise and dumb it down for everyone else so they don't have to learn it, unless they want to, of course. So yeah, now that the Building AI Agents series is shipped, or at least the first three parts are, I'll focus on that next, collect some feedback on the series, and then make a sneaky part four.
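For a sense of what an ingestion pipeline like that might emit, here is a minimal sketch of shaping email pairs into chat-format training data, assuming the chat-style JSONL layout most fine-tuning toolkits accept. The field names, helper, and example pair are illustrative, not the actual pipeline:

```typescript
// A minimal sketch of a Gmail-to-training-data step, assuming chat-style
// JSONL output. Names and the example are illustrative, not the real pipeline.
import { appendFileSync } from "node:fs";

interface EmailPair {
  incoming: string; // the email being replied to
  reply: string;    // the reply actually written (the target draft)
}

function toTrainingRecord(pair: EmailPair): string {
  // One JSONL line per example: the system message frames the task, the
  // user turn carries the incoming email, the assistant turn the real reply.
  return JSON.stringify({
    messages: [
      { role: "system", content: "Draft an email reply in the user's own voice." },
      { role: "user", content: pair.incoming },
      { role: "assistant", content: pair.reply },
    ],
  });
}

const pairs: EmailPair[] = [
  {
    incoming: "Hi Max, could we move our call to Thursday?",
    reply: "Thursday works great, same time? Cheers, Max",
  },
];

for (const pair of pairs) {
  appendFileSync("gmail-finetune.jsonl", toTrainingRecord(pair) + "\n");
}
```

Synthetic examples, as mentioned above, would just be extra `EmailPair` entries generated by a larger model and appended through the same function.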

Kicking off Tagging Support Tickets Project

I also got to meet with n8n's support team. They did some thinking on problems they'd like to solve with automation, and they've conscripted your friendly neighborhood flowgrammer (that's me) to help them with that. So we jumped on a discovery call this week and structured how we're going to approach this. I recorded the meeting and cut it down, because I think it's an interesting thing to watch if you're thinking: OK, it's 2025, I want to add AI to my business processes, but how do I do that? How do I start? Think: meet with the people who have the problem, discover what they need, get a game plan, and get going. So check this out.

Hey guys, how's it going? Hey Max! So, the support team has joined me today because we're going to have a look at some of the pain points they have in their day-to-day work helping the n8n community, and we're going to help them automate, using n8n to solve the problems of n8n. How does that sound? Awesome! Amazing, sweet. Well, Ria was kind enough to put together a doc going over some of the stuff you have on your horizon. Ria, would you mind walking us through it? Yes. We use a ticket system called Zammad, and this is also the integration we'll need in our workflow. We get support emails, and according to their content we want to categorize them, and we want to use AI for that. So we've got all these requests coming in from users, they land in Zammad, and then you tag those tickets with a tag taxonomy you maintain, to make sense of it all and organize it, and that's what you're looking at automating, is that right? Yes. At the moment this is done manually: we categorize according to the content (is it maybe a billing-related issue, is it a bug) and then we apply labels, which also helps us for statistical purposes, because at the end of the day we want to know, and the product team also wants to know, what topics users are messaging us about, what the pain points are, where the impact is. Another use case is categorizing for triage, to determine in which priority we have to work through the tickets: by Enterprise users, say, or by the severity of the issue being reported. Got it. So if someone's instance is down and they're on an Enterprise plan, obviously that's super critical, right? We've got to react to that quickly. Makes a lot of sense, and it sounds like classification is a pretty good use case for AI. Are there any other problem ideas, or do you want to focus on this tagging issue? Because I guess that affects every ticket, right? It affects every ticket, and because we're just starting to get more AI into our processes, it seems the simplest thing to start with. At some point, of course, we want to enhance this with a full-fledged AI bot that responds to users as well, with potential solutions or references to fix the issue. So this is just the beginning. Yeah, that makes sense. Now, thinking about this: how long would you say the tagging process takes you yourself, per ticket? Sometimes almost half a minute, and it's also iterative: as the ticket grows in replies, there may be other tags, because the content changes. And I imagine that as the team grows and the number of tags grows, the rate of human error could also increase, right? The time to do it, or someone new not knowing about some new tag. Right, exactly.
Sometimes when we introduce new tags, obviously we don't go back in time and retag older emails. The other thing is that sometimes you just don't tag at all, because you forget there's a tag for it, but it would be valuable to include that in our records of tags for these emails. And how many tags do we have in total right now, and how many do we expect to have in, I don't know, six months? Oh my god, I think it's maybe around 40. OK. It shouldn't really grow that much, but it depends on how granular we want our tracking to be. So, in terms of scoping this down, here's what we could try at first: we could use two different frontier models, run the ticket through an advanced GPT model and through an advanced Anthropic model, and check that they produce the same result; if not, it gets assigned to a human. If both models say "yes, it's these three tags," maybe that could be our definition of high confidence, and then we automatically apply the tags. Now, thinking about how good the solution has to be: humans have error rates, systems have error rates, and the more you work on a system, the more you can reduce its error rate. What's the worst-case scenario if something gets incorrect tags? At the moment, the worst-case scenario might be not including it in our statistics. So the worst case doesn't really have a big impact if we don't get the tags right; the bigger impact would be further down the line, if the AI replied to the user with some rubbish. Yeah, that makes sense. I think no matter what we do, we'll have to track which tickets got auto-tagged, even if it's not in Zammad but somewhere else. I think the next thing to try is a spike: let's take a couple of hours, try a simple, naive approach, and see how good it is, to inform whether we need something more complex. We probably want to use an open-source model, right? We're piping in PII data, so we've got to self-host the model, and self-hosted models these days have quite a lot of context window. What I would try first is taking some examples from Zammad, so multi-shot prompting, and seeing if a decent system message with those examples is enough to categorize on that system. The thing we'll need more specific examples for is the more niche tags: an LLM today could probably already classify "billing or not billing" without any examples, but maybe less so for more specific stuff. It's really going to depend on how many specific examples we have and how similar those tags are. But that's what I would try first; it could take a couple of hours or less, and if it works, it's done, we ship it, and we're happy. If it doesn't work, we could then look at fine-tuning models. Or, even if it is working, we could fine-tune to get a more performant, efficient system, to show people what you would do if your ticket volume were 10x this. I'm fairly confident, from some of the research I've been doing, that we could take a small model, maybe a 7B or 3B model, and create training data from your existing tickets. I imagine we have well over 200 correctly tagged tickets at this point; we get 200 tickets a week or something like that. So yeah, I think that's a good approach: we'll try the simple version first.
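To make that two-model cross-check concrete, here is a minimal sketch of the agreement logic, assuming each model call returns a list of tags. The `Tagger` type and both model callers are hypothetical stand-ins for whatever LLM nodes the workflow would actually use:

```typescript
// Sketch of the two-model agreement check discussed above. The Tagger
// callers are stubs; in n8n these would be two LLM nodes (e.g. one GPT,
// one Anthropic) whose outputs feed a Code node doing this comparison.
type Tagger = (ticketText: string) => Promise<string[]>;

async function tagWithConsensus(
  ticketText: string,
  modelA: Tagger,
  modelB: Tagger,
): Promise<{ tags: string[]; confident: boolean }> {
  const [a, b] = await Promise.all([modelA(ticketText), modelB(ticketText)]);
  const setA = new Set(a.map((t) => t.toLowerCase()));
  const setB = new Set(b.map((t) => t.toLowerCase()));

  // "High confidence" = both models produced exactly the same tag set:
  // auto-apply in Zammad. Any disagreement routes the ticket to a human.
  const agree = setA.size === setB.size && [...setA].every((t) => setB.has(t));
  return agree
    ? { tags: [...setA], confident: true }
    : { tags: [...new Set([...setA, ...setB])], confident: false };
}
```

The nice design property is that disagreement is not treated as an error: it is the signal that routes a ticket into the human review queue described above.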
And then, if we need to do the fine-tuning, it sounds like we've got hundreds, maybe thousands, of examples, and we know a pretty good ETL tool that can clean them up and prepare the training data. Hint hint: it's n8n. These could be famous last words, but I see this as potentially an S-sized, maybe M-sized ticket, maybe even XS. This sounds like something we could get done and built, helping the team out pretty quickly. Yeah, I think so too. I think my biggest uncertainty right now is how to self-host the model. What's your experience with that? I don't have that much self-hosting experience; I've experimented a little with Ollama. So here's what I'd say: when we're building this out, we should test by just running a model locally. Step one is getting it working locally, because if we can get it to run on my MacBook, we can have a chat with the engineering team. If we can show "this is working, and this would save X hours if it were deployed," it's going to be a lot easier to get it deployed in our organization. Ria, could you prepare some of the example data from Zammad for me? And Ria, if you think there's other metadata that could be useful for categorization, for example the email subject, or whether we already know the user is on Cloud or not, please add that in as well, because it could be very helpful for the example set or the fine-tuning set. As for which tickets you pick, the guidance is that your examples should be representative of what you're going to see; maybe the easiest way to do that is to make sure there are at least a couple of examples for each tag in the system. Nice. Did you mean to include bad examples as well, like how not to do it? So far I don't think we need anti-examples, just examples of what's right. Honestly, if we have all these examples, and piping 20-30 of them into context isn't working reliably, then let's go the fine-tuning route. Cool, OK, great, sweet. So, since we've got your boss here: how does that sound? It sounds amazing. Yeah, if we have this data, it'll be amazing. Again, it could be famous last words; I'm an optimist, so I often go "ah, this is going to be super easy" and then reality hits. Once you've gotten that data to me, I'll schedule a build session, we'll jump on a call, have a look at it, and go from there. Oh great. I'd be curious which model you choose. I'd also be curious; I have no idea right now. I'm really going to try some of these smaller models, because even for classification they still have good context windows. Let's see: I'll probably try some flavor of Llama, and sometimes there are more niche models that aren't as popular but are specifically trained for classification, so I'll have a look, and I'll share those learnings. I'm so happy you're doing this on your computer, because I think I only have like 8GB. You know what's cool, though: some of these very small models, like 1B, can run on 8GB. [A few moments later] Thanks so much, super excited to get going on this project. I'll see you soon, bye! See you, Max. [An awkward pause]
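As an aside, the "naive spike" floated in that call, a multi-shot system prompt over a small self-hosted model, could look roughly like the sketch below. This is written under assumptions: the tag list and examples are invented, the model tag is a placeholder, and local inference goes through Ollama's /api/chat endpoint. In n8n this would be a chat-model node with a system message assembled the same way:

```typescript
// Rough sketch of a multi-shot classification spike against a local model
// served by Ollama. Tags, examples, and the model tag are placeholders.
const TAGS = ["billing", "bug", "how-to", "enterprise-critical"];

const EXAMPLES = [
  { ticket: "I was charged twice this month", tags: ["billing"] },
  { ticket: "The webhook node crashes on large payloads", tags: ["bug"] },
];

// Multi-shot system prompt: the allowed tags plus worked examples in context.
const systemPrompt = [
  `You tag support tickets. Allowed tags: ${TAGS.join(", ")}.`,
  "Reply with a JSON array of tags and nothing else.",
  "Examples:",
  ...EXAMPLES.map((e) => `Ticket: ${e.ticket}\nTags: ${JSON.stringify(e.tags)}`),
].join("\n\n");

async function classifyTicket(ticket: string): Promise<string[]> {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2:3b", // any small local model with a decent context window
      stream: false,
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: ticket },
      ],
    }),
  });
  const data = await res.json();
  return JSON.parse(data.message.content); // assumes the model obeyed "JSON array only"
}
```

With real tickets, those in-context examples would come straight from the Zammad export Ria is preparing, at least a couple per tag, as discussed in the call.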
So, as a quick update: I already heard from Ria, and she's making good progress on getting that dataset prepared; we're just figuring out exactly what metadata to include. Once she sends it over, I've got to stand up a model locally, because again, that data is rife with personally identifiable information. I hope that's evidence that n8n cares about your privacy, and about not being fined massively by the German government.

Dynamic Prompts with Jim Le

All right, let's keep things rolling. Next up, I had a chance to meet with the venerable Jim Le; some of you might know him from episode one. Jim's a super smart guy. He showed me a pattern he figured out that's a bit hard to explain, but basically he's exploiting the description metadata available in a lot of spreadsheet-like products, such as Baserow and NocoDB, and using those descriptions in the system prompts of his AI steps, so they're dynamic. The end user can just modify a description in Baserow, say, and it updates their AI workflow. I'm not doing a great job of explaining it, so just watch this.

Hey Jim, how's it going? Hey Max, I'm good, yourself? Doing very well. Happy New Year to you, mate. 2025! That's right, so for the next couple of months I'm going to be misspelling all my dates. Jim, what have you been up to since our last call? Yeah, over the last two months I've mostly been working with n8n, client work, but also exploring some of the last-minute releases from the end of last year, like Gemini 2. Very cool. For anyone who's not familiar with Jim's work: Jim creates a bunch of really awesome workflow templates in the n8n community, and we had him on the show previously, but for everyone else, Jim, could you give a quick introduction? Yeah, I'm a freelance developer. I started early last year, and I've been working exclusively with n8n for all my AI workflows, building a lot of ETL solutions, taking in invoices and bank statements mostly. Very cool. Now, Jim has a solution he wants to show us today. It's more of a pattern that you can apply to various use cases, and the reason it's super interesting to me is, firstly, that it's not an AI agent (so yes, you can create value with AI without agentic solutions), and secondly, that it allows you, as the developer, to give your end user more control. On that note, Jim, would you mind showing us what you've been working on? Absolutely, my pleasure. Hey everyone. For most of last year, working in the AI extraction world, we slowly started to realize it was very static. What I mean by this: take this spreadsheet, for example (I'm using Baserow), where we can see four columns: file, full name, address, and email. If I click on the cell, there's a test CV. Usually someone asks us to build a workflow to pull certain attributes and data out of the document, the problem being that these are usually statically defined up front. Here's a quick workflow showing how we would do it: we'd have a webhook to run it, we'd pull the rows that need to be filled, download the file attached to each row, extract the text from the CV, and then use the Information Extractor, which takes the text from our PDF and pulls out the columns we need. If I run this and double-click through, we can see it's done just that: it's pulled the name, and we've mapped it to update our Baserow. Fantastic. Now, you immediately see the problem: if someone adds another column, we would need to go back into our extractor and add the attribute there. And I guess you'd also have all those previous records that don't have that field backfilled, right? Exactly, so we have a schema-syncing problem, but also a row-syncing problem. If you ran this manually it would be no problem, but if it runs automatically, it's kind of a nightmare.
You end up on the hook as the developer of the workflow: you end up maintaining the template and being pinged every time someone wants to change something, redefine the criteria, or add more fields and columns. So, Jim, you're telling me that you don't like being pinged to change schemas in workflows, is that correct? Just on camera, for the record. OK, good to know. So in this scenario, all the power, all the responsibility, is on the developer. What I've been working on over the last few months is: what if we could turn that around, a complete 180, and put the prompting and the definition of the schema in the hands of the user instead? This is where we explored dynamic column prompts. In this workflow, we've switched away from defining structured output in our LLM nodes and moved it into Baserow. How do we do that? If I switch to Baserow now and open this example, I have the same setup, but notice that in this field I've defined the attribute name, and the description is where I've chosen to embed the prompt; here we can see "full name of candidate, prefixed." What I've also done, since Baserow allows very granular webhooks, is add a hook so that once a row is created or updated, or a field changes, it runs my dynamic column prompts template. What does that do? So, Jim, basically, when you change the column schema or add a row, it pulls that schema live before it gets to the extraction step, and it uses the description written by the end user here in Baserow to tell the LLM: here's the name of the thing to extract, and here's what you're looking for. Is that right? Exactly. The schema, the column attribute you want to extract, and the prompts all live in the Baserow spreadsheet, and n8n pulls that in and does the rest to fill in the values. I can give you an example: say I want to add a new field, "address." I describe it, create it, and that kicks off a run to get the address. Here it's come back with NA, because this test CV doesn't really have an address. So what I can do now is edit the field, put in "nearest city, if not available," and save it. Because there's a webhook for the event type "field updated," it runs the template again, and it comes back with the updated prompt result, which is why it now shows the fictional city from the CV instead of NA. So: you updated the description, Baserow saw that and fired a webhook event, your workflow picked it up with that new description dynamically piped in, and it knew to go, "oh well, I do know the city, let me pop that in instead." Is that right? Exactly right. This is so freaking cool; my mind is currently blown. So now you can see how all the power is with the end user: they can add an email type or email styling and so on, and it keeps triggering. Now, the best part: it's unlikely we'll only have one row. If we add other CVs in here (I've added my own), Baserow will send the file, and n8n will use the LLM to find the missing values and populate them using the prompts in those columns as well. Very cool. So once again, it took that PDF, ran it through the dynamic description prompts, and output those three values in that row. Very cool stuff.
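Pausing on the mechanics for a second: the demo boils down to two Baserow API interactions, reading the field list (where the end user's descriptions live) and writing values back to rows. Here is a sketch of those two calls, assuming Baserow's REST field-list and row-update endpoints and assuming the field objects expose the description shown in the UI; the URL, token, and IDs are placeholders, and in Jim's template an n8n HTTP Request node plays this role:

```typescript
// Sketch of the "pull the schema live" step against Baserow's REST API.
// Token and IDs are placeholders; description exposure is an assumption.
const BASEROW_URL = "https://api.baserow.io";
const API_TOKEN = "YOUR_DATABASE_TOKEN"; // placeholder

interface BaserowField {
  id: number;
  name: string;
  type: string;
  description: string | null; // the end user's mini prompt lives here
}

// Fetch all fields for a table and keep only the AI-filled columns,
// i.e. the ones where the end user wrote a description/prompt.
async function getFieldPrompts(tableId: number): Promise<BaserowField[]> {
  const res = await fetch(`${BASEROW_URL}/api/database/fields/table/${tableId}/`, {
    headers: { Authorization: `Token ${API_TOKEN}` },
  });
  const fields = (await res.json()) as BaserowField[];
  return fields.filter((f) => f.description?.trim());
}

// Write extracted values back to the row the webhook event pointed at.
async function updateRow(
  tableId: number,
  rowId: number,
  values: Record<string, unknown>,
): Promise<void> {
  await fetch(
    `${BASEROW_URL}/api/database/rows/table/${tableId}/${rowId}/?user_field_names=true`,
    {
      method: "PATCH",
      headers: { Authorization: `Token ${API_TOKEN}`, "Content-Type": "application/json" },
      body: JSON.stringify(values),
    },
  );
}
```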
Jim, would you mind showing us the key moving parts powering the solution? Yep, my pleasure. So this is the template we're working with. The Baserow event comes in, in this case a "field updated" event, and it gives us the field that was updated, which is the address field. In the first part we get the schema from Baserow itself (with a step to ensure it only runs once, because we only need it once), and here it is: we have the name of the field, and then the description, which acts as the prompt. The Switch node routes based on the task type: if you updated a row, it routes to the top; if you updated a field, it routes to the bottom. Gotcha. Yeah, the big difference is that if you update a row, you only expect that row to be updated, but when you update a column, you want to update all rows under that field. Next we've got the Baserow fields, and because we're only interested in the columns that have descriptions, we filter out the ones that don't; this is just a simple code filter, which drops, obviously, the name column and the file column so we don't overwrite them. Next is where we get the PDF and extract the data, and here's the deep dive into the prompt. The document data sits within these XML tags, "file", which act as fence posts telling the LLM where to look for the data to extract. Under "data to extract," the description is used; this is where the mini prompt goes. Then, as a little extra, there's the output format: we read the field's type, so if it's a number, the LLM is told to respond with a number; otherwise, if it's text, it can do prose. Looking further down, you can see where the description landed, the mini prompt, and this happens for each field: per field in this loop, we're saying "you have one job: find the email," and we insert that dynamic description. What this means is that you can have really long mini prompts (mini large prompts, I guess), and they won't overlap; there's no prioritization of tokens, because each call is focused on that one field in that one row. Makes sense. Did you hit any description length limits in Baserow itself? That's a good question. I didn't, but I had three or four paragraphs' worth last time I checked. OK, so there's a sufficient amount of space there. Gotcha, very cool. What's happening in the system message here? Are there any specific details you'd call out as useful for this solution? In this system prompt we want to generalize, and the reason is reusability: this way it works for any file-based workflow. Here it's CVs and job applications, but you could do invoices or bank statements, and use it to compare or bulk-extract from those documents. So the system prompt is deliberately generic: I'm only asking the LLM to assist the user in extracting the requested data from the text in the file. I've asked it to keep things short, which is a preference depending on what you'll use the output for. And this last bullet point is an escape hatch: we don't want the LLM to hallucinate, or to try to please the user, so we allow it to say NA, not applicable, or couldn't find it. We saw that in the previous example when you were running it.
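Here is a rough reconstruction of that per-field prompt: the document text fenced in XML tags, the column description dropped in as the mini prompt, and an output-format hint derived from the field's type. The exact wording of Jim's template is paraphrased, not copied:

```typescript
// Rough reconstruction of the per-field extraction prompt described above.
// Wording is paraphrased; the structure (XML fence, mini prompt, type hint)
// follows the walkthrough.
function buildFieldPrompt(
  documentText: string,
  field: { name: string; description: string; type: string },
): string {
  // Output-format hint: numbers get a bare number, everything else prose,
  // with NA as the escape hatch when the data genuinely is not there.
  const outputFormat =
    field.type === "number"
      ? "Respond with a number only."
      : "Respond with short prose. If the data cannot be found, respond with NA.";

  return [
    "<file>",
    documentText, // extracted PDF text the model should search
    "</file>",
    "",
    `You have one job: find the "${field.name}" in the file above.`,
    field.description, // the end user's mini prompt from Baserow
    outputFormat,
  ].join("\n");
}
```

Because each call carries exactly one field's mini prompt, long descriptions never compete with each other for the model's attention, which is the "no prioritization of tokens" point above.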
And I think what's really cool is that, even though NA is the default, the description the user wrote can override it. The way the system is built right now, the end user in Baserow can still override the NA behavior if they want, at per-field granularity. Very cool. Yeah, it makes for a very powerful, very generalized system that we can reuse across models. So once we're here, we've got the address, and with that value we update Baserow. Very cool, and that's just a simple HTTP request to the Baserow API to update the row, and we know its ID because it's in the item coming in. Yep, exactly, simple. Are there any differences in the process running on the bottom, for the field update versus the row update? Is it basically just how many times you're doing it, one row versus many rows? Yeah. Because a field update is effectively a column update, it has to run through every available row in the table, but targeting only the field that was updated. That's an optimization: you could update every single column, but depending on how many columns you have, that could take a long time. So the bottom path updates every row for that one column, while the top path, triggered by a row being created or updated, goes through just that single row and updates every column in it: going horizontally rather than vertically. Got you. But once we get to the actual dynamic field itself, it's a similar thing happening in the AI step? Very similar, yeah. It's just pulling each column and its description, using it as a prompt, and doing it one field at a time.
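The two branches behind that Switch node are easy to get backwards, so here is a small sketch of the fan-out logic as described: a column update runs vertically (every row, one field), while a row create/update runs horizontally (one row, every described field). The `fillCell` callback is a stand-in for the per-field LLM step above:

```typescript
// Sketch of the two Switch-node branches described in the walkthrough.
// fillCell stands in for whatever fills one cell (the per-field LLM call).
type FillCell = (rowId: number, fieldName: string) => Promise<void>;

// Bottom branch: a column's description changed, so re-run every row,
// but only for that one field (vertical fan-out).
async function onFieldUpdated(fieldName: string, allRowIds: number[], fillCell: FillCell) {
  for (const rowId of allRowIds) {
    await fillCell(rowId, fieldName);
  }
}

// Top branch: a row was created or updated, so fill every AI column
// for just that row (horizontal fan-out).
async function onRowUpserted(rowId: number, describedFields: string[], fillCell: FillCell) {
  for (const fieldName of describedFields) {
    await fillCell(rowId, fieldName);
  }
}
```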
Now, I see you're using the Google Gemini models. Which model are you using specifically? I'm using Gemini 1.5 (the 002 release), and I've found it to be very good at this kind of analytical data-pulling; it's not very creative, and it stays factual when extracting fields. OK, gotcha. Jim, I heard you have another example of this pattern, a slightly different use case, to show folks how powerful this is. Would you mind showing us that? So this example is very similar to the previous template, but instead of taking PDFs it works on plain text. The grid I'm going to show you now is a content-generating example of the same idea. Here we have cats, specifically, and for the cats we have these columns: description, size, diet and nutrition, and temperament, and in the fields themselves we have the prompts. The idea is that it generates content section by section, say for a blog or a profile, using these attributes and these prompts. No PDFs involved, no files involved. If I type in a name here, same idea with the webhooks: it posts to my n8n instance and triggers on row updated, field created, and field updated (not row created, anyway, I digress). So this has run with only the name, and in the background it's used the LLM's training data, what it already knows about Siamese cats, to fill in these columns. If I go to the template now: again, same idea. It gets the schema, gets the descriptions (which are my mini prompts), and routes to either update the row horizontally or update many rows if a field was updated. If I go into the LLM node, the prompt looks like this: the subject, which is the name of the row, and then the description, which is the column description acting as "what is the data to generate." And the system prompt is simply reworded to assist the user in generating the required content from the given subject. So basically similar: you're giving it the escape hatch (I really like that term, by the way), you're giving it basic role prompting, and from there it works.
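For contrast with the extraction prompt earlier, the generation flavor of the pattern might be assembled like this: no file at all, just the row's name as the subject plus the column description as the instruction. Again, this paraphrases the walkthrough rather than quoting Jim's template:

```typescript
// The content-generation variant of the same pattern: subject + column
// description, no document text. Paraphrased, not the exact template.
function buildGenerationPrompt(subject: string, fieldDescription: string): string {
  return [
    "Assist the user in generating the required content for the given subject.",
    "Keep it short. If you cannot answer factually, respond with NA.", // the escape hatch
    "",
    `Subject: ${subject}`,
    `Content to generate: ${fieldDescription}`,
  ].join("\n");
}

// e.g. buildGenerationPrompt("Siamese", "The typical temperament of this cat breed, one sentence")
```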
I guess the nice thing with this is that the AI step doing the gen AI here is a simple chain, right? But theoretically there's no reason, given the general pattern you showed, that you couldn't swap in a different AI step. It could be an agentic step with tools to do research; maybe you've got a proprietary RAG vector store of cats, or something more useful, I don't know, maybe there's a lot of business value in cats. Then it could output information that's not just based on the LLM's training data: it could be based on your proprietary data, PII information, whatever. This is such a cool pattern, Jim. Thank you so much for coming up with it and documenting it. Fantastic, thanks Max. I think when people are starting out, they'll go with the very static flow and quickly run into these issues, and with this pattern you can be one step ahead, create a lot of dynamism, increase productivity, and really empower the people who don't necessarily touch the tools. Makes a lot of sense. And I'm guessing this is already super valuable for the clients you're deploying this pattern for. As you look ahead, any ideas for what you might add, or where this might go? Yeah, like you said, we can definitely swap out these basic LLM chains for agentic nodes, with tools reaching into, say, sales spreadsheets, where you could pull product information from the backlog and compare the performance of different items. This is a great pattern to extend; it's really up to your imagination where you take it. Yeah. So I think the challenge is now on the community: you saw the pattern, what are you going to apply it to? One really nice side effect: I'm seeing a lot of folks point out that sometimes the best person to write part of a prompt is the domain expert. Say someone is a content writer on cats: they might write a much better description of what they need than the developer implementing it for them. But there's this asymmetry where they can't really get into the prompt, or they can sit next to the developer during the build, yet the developer becomes the bottleneck the moment they want to change three words. This removes that. So I'm guessing your inbox and your Slack are a lot happier as well after implementing such patterns? Exactly. Very cool. Jim, this is your second guest appearance on the show, and every time it's pure gold, so I expect I'll be badgering you soon enough for a return to share more interesting insights from building on the ground. That's something I really appreciate about your work: it's grounded in client work, not theoretical stuff. You have to solve real business problems or you're not getting your invoices paid, which leads to great ideas for all of us. So thanks so much, Jim, happy flowgramming as always, and I can't wait to chat with you soon. Thanks, Max. All right, pretty cool. Jim's a super smart guy; can't wait to have him on the show again to make me look good. Thanks as always, Jim.

Wrap up

Let's cross the street without getting run over. That's everything this week. What worked? What didn't? Drop a comment and let me know. Completely unrelated note: apparently the YouTube algorithms love comments with more than seven words, but again, that's completely unrelated to the request for comments. I'm Max, this is the Studio Update, you're awesome, we'll catch you next week, and happy flowgramming!
