‘Current AI developments at n8n’ - from the Amsterdam Meetup (November 2024)
Duration: 19:35


n8n · 28.11.2024 · 1,287 views · 41 likes


Video description
In this talk from the n8n Meetup in Amsterdam, JP van Oosten, Engineering Manager of the AI Team at n8n, shares updates on AI projects the team is working on, including improved chat and log interfaces and the new "nodes as tools" feature. Links: - Read the full report on the Amsterdam meetup here: https://community.n8n.io/t/amsterdam-november-2024-meetup-report/62428 - Interested in hosting a community event in your area? Become an n8n Ambassador: https://n8n.io/ambassadors

Table of contents (4 segments)

Segment 1 (00:00 - 05:00)

If you're familiar with AI within n8n, a bunch of this will be relevant for you in the near future. These are some of the improvements that we've been working on recently, or that we're working on right now.

If you have an agent in n8n, you're probably familiar with the chat trigger, which allows you to host a chat instance on your own website or try out typing in n8n itself. I have a very simple AI agent here that is connected to, in my case, Claude. Agents talk to tools, and we already had a couple of very useful tools, such as the HTTP Request tool, and you could call out to other workflows. But recently we've been working on making other nodes we already have available as tools. In this case I picked two: an Airtable node and a Telegram node, which helps me, for example, do approvals and things like that together with the AI agent. So this agent will fetch some information from Airtable, summarize it, and so on.

Normally, when I click the chat button down here to start the manual chat testing procedure, it would overlay the chat modal on top of the entire canvas, so you could no longer see what was going on on the canvas. What we've implemented now is a way to get this in a pane: you can see the logs here and your chat here, and you still see your workflow at the top. That means that when I run this, you can actually see which nodes it's executing. So if I say "send me all the data from the employee with the name Mario to my Telegram", you can see it start processing: the agent takes some data from Airtable and then sends it to my Telegram. I can see it here, there's an unread message: "Here's the data for employee Mario", and I get all these orders.
All of that is being done by this agent, and I can inspect it from the logs, as you may be familiar with, because the logs have been there before, but they used to overlay what was happening over here. You can still see the chat and what it sent back.

One thing we did to enable these nodes as tools: you need to send parameters to these tools. The HTTP Request tool has a very big form where you have to put in all the parameters. For nodes as tools, we instead introduced an expression-based way of sending this data back and forth between the agent and the tool, and we call this the fromAI function. It allows you to put in the name of the parameter that you want the AI agent to send to this tool. You can also set a description: if you have multiple of these placeholders, the description lets the agent distinguish between them. If you have "employee name" and "customer name", for example, an agent might confuse them, so you can give each one some extra description, a type, and a default value to use if nothing is passed. Most of the time you'll just use the key, the placeholder name. That's how it works. If you're working with memory and things like that, you can also do some things with a session here.
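The fromAI placeholder described above is written as an expression inside a tool node's parameter field. A minimal sketch of what two such fields might contain, following the key/description/type/default structure described in the talk (the parameter names and values here are made up for illustration):

```
{{ $fromAI('employee_name', 'Name of the employee to look up in Airtable', 'string') }}
{{ $fromAI('chat_id', 'Telegram chat to send the summary to', 'string', '123456') }}
```

The agent reads these placeholders to learn what it must supply, and n8n writes the values the agent returns back into the node's parameters before the tool runs.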

Segment 2 (05:00 - 10:00)

Q: For this thing where you define the employee name: it looks similar to having the regular OpenAI node and saying "output in JSON and put the employee name in blah blah". Is that similar?

A: Yes, it's also using the expression function; it's basically part of the expression parser that we use. We look for these fromAI function calls both to figure out what we need to send to the agent, so it knows what it needs to send back, and to know where we then need to put that data back into the parameters for these nodes. So it's serving a dual purpose.

Any other questions so far? These are the two things we've been working hard on for the last couple of weeks, so there's a bunch of tools you can now use, and the list is pretty big: Airtable, Baserow, a whole bunch of tools you can attach to your agents. If you use Notion a lot, you can use that; you can use Postgres, of course, all those kinds of things.

Then there is some cool stuff coming. One of the things my team is now working on is evaluation. Evaluation is for when you have a workflow and you want to make sure it keeps working while you change, for example, your prompt: you want to know whether the responses still make sense. What we're building right now is a very flexible way for you to create a dataset and figure out how to test your workflow to make sure it stays consistent. Part of that is execution annotation: I have a bunch of executions here, and I can give each one a rating. I can say, for example, "I think this one is good, but this one is bad", and that will then factor into the evaluation feature, which we will probably release early next year. That's what we've been working on.

Q: It's an
amazing feature, I think, and it looks a lot like LangSmith.

A: Yes, of course, we took inspiration from a bunch of places to figure out where we want to go.

Q: The current agents are based on LangChain, right? Is that potentially causing any issues?

A: So the question is: does using LangChain together with evaluation cause any conflict? No, I don't think so. LangChain offers integration with LangSmith if you want, but we want to offer you a way to do it in your workflows, because we want you to be able to test your workflows not just with LangChain tools but with any other nodes that you have. If you have a normal workflow where you want to test whether, for example, the Code node is producing the right output, you could do that with our evaluation feature.

It will look something like this. It will probably change a lot, because we're still workshopping the front-end part of it, but you will be able to tag these executions and know what is in there, and you can then base your tests on that. You can set some metrics, for example accuracy or appropriateness: is the summary good enough, does it contain words on the bad list, that kind of thing is what you could check in one of those evaluations. And if you set relevance- or coherence-type metrics, you can eventually plot them over time. The interface is all still in progress, but this is a very interesting feature that I think is going to help a lot of people move their workflows to production.

Q: Is it meant to keep evaluating when it goes into production? Even the models can sometimes change a little bit.

A: Yes. We're thinking about that right now; for now it's probably run once, when you

Segment 3 (10:00 - 15:00)

run the test manually, so that, for example, if you change something, it will run. We're also wondering how to design the interface so that you could, for example, run these tests every night, or every Sunday evening, or whatever. That is something we're still thinking about, because especially if models change, you want to know about it.

Other things we're working on: I introduced these nodes as tools, the service nodes that you can use in your agents. The fromAI syntax is not always very intuitive or discoverable, so we're now thinking about ways to help people discover it and work with it more easily, so that you can easily configure the parameters the agent needs to communicate with the tool. So there are tool tweaks, a whole bunch of small tweaks, coming, especially for the sub-workflow tool. If you ever use an agent that needs to call a different workflow in n8n, that's what we call the sub-workflow tool; it's also going to change quite a bit, especially around creating a new sub-workflow. Right now you have to first create the new sub-workflow, then figure out the ID of that new workflow, and then copy it back into your agent tool. We're going to change that and add a button that lets you create a new sub-workflow directly. Debugging is also a pain right now, so we're adding ways to go to the sub-workflow execution from your logs panel: if you have this agent with a workflow tool, you can click a button to immediately jump to the execution of that sub-workflow.

That's basically the overview of what we're working on right now, and it's hopefully coming very soon. Some features, like evaluation, are quite big, probably early next year; the other stuff might even be next month. Questions?

Q: Maybe I understand incorrectly, but are the
features you're now talking about all AI stuff, or also for normal workflows?

A: Some of it is also for normal workflows. The evaluation, for example, you could use without any AI component. We are building it right now for the AI community, because we believe evaluation is especially useful for things like making sure a workflow stays consistent over time, so that if you change a prompt it doesn't break down as much. That's why we're building it in a very flexible way, because that flexibility needs to be there: you might want to have an LLM as a judge, for example, which is something we're looking into, mainly from that angle. The sub-workflow tool improvements are also mainly coming from the AI agent side, but we'll probably implement them in the normal Execute Workflow node as well.

Q: We are using a lot of sub-workflows.

A: Yes, obviously. You already saw that we improved the interface for workflow selection, right? You no longer need to enter the ID of the workflow; you can now select it from a list. Previously you really had to copy the ID of the sub-workflow, but now, if it's from the same database, it will already give you a drop-down list with all the workflows. That's already available now.

Q: I think the evaluation thing is absolutely massive, because it's one of the things we've been struggling with. We have a group of AI agents, and then we have an AI that is testing those agents. When AI is talking to AI, it generates a lot of data, and we struggle to evaluate that. You mentioned that the evaluation would basically allow us to have AI analyze the discussions between our tester AI and the group of AI agents?

A: So the question is basically: do we allow with this
feature a way to evaluate a whole bunch of agents that are working together? Basically, what happens with the evaluation is that you get another workflow that does the evaluation.
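The pattern described here, one workflow under test and a separate workflow that judges its output, can be sketched in plain JavaScript. The function and field names below are hypothetical and only illustrate the shape of the idea, not n8n's actual evaluation API:

```javascript
// Sketch of the evaluation pattern: run each dataset example through the
// workflow under test, then hand the output to a separate "judge" that
// returns a metric. All names here are illustrative, not n8n's API.

// A toy judge implementing the "does it contain the words on the bad list"
// metric mentioned in the talk.
function badListJudge(badList) {
  return (input, output) => {
    const hit = badList.some((w) => output.toLowerCase().includes(w.toLowerCase()));
    return { pass: !hit, score: hit ? 0 : 1 };
  };
}

// Run the whole dataset and collect per-example verdicts.
function runEvaluation(dataset, workflowUnderTest, judge) {
  return dataset.map(({ input }) => {
    const output = workflowUnderTest(input);
    const { pass, score } = judge(input, output);
    return { input, output, pass, score };
  });
}

// Example: a stand-in "workflow" that just echoes its input.
const results = runEvaluation(
  [{ input: "summarize Mario's orders" }, { input: "leak the password" }],
  (input) => `Summary: ${input}`,
  badListJudge(["password"])
);
console.log(results.map((r) => r.pass)); // → [ true, false ]
```

Because the judge is just another function (or, in n8n terms, another workflow), it can be as complex as needed, including an LLM-as-judge, as long as it ultimately emits a metric.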

Segment 4 (15:00 - 19:00)

But we do it in such a way that, for example, you can label these executions, those will be run through the original workflow, and the output from that will be passed to another workflow that actually tests your workflow. You can do anything you want in that test workflow, even very complex stuff, as long as some sort of metric comes out that says yea or nay. So if you want to analyze that whole chain of agents, of course you can do that.

Q: So, for instance, if we have given the tester agent questions that we know our manager agent should pass on to, say, an FAQ agent that's querying a database, then we would automatically be able to flag when it has failed to produce something we know it should be able to produce?

A: Yes, you can dig down into the parameters being passed between agents. We're not going as far right now as to also let you debug everything that's happening inside a sub-workflow, so you would probably need to build a test for your sub-workflow separately, but you could test it end to end: you could test whether something good is coming out of your sub-workflow, because that output is present in your current workflow. Does that make sense? Cool, any other questions?

Q: I don't know if I have a specific question, and the food's here, so I'll be quick. There's this problem of non-deterministic, squishy stuff interfacing with deterministic stuff, like fromAI calling Airtable. I don't understand how to test these workflows in a way that lets me improve them. I'm building a RAG system trying to query some documents, and I want it to get better at answering. I understand from these presentations that I can put in examples and improve instructions, but how do I measure that it's getting better? I need somebody to judge, right, a reviewer. Is there any systematic way to improve it?

A: So basically the question is
how to check whether my changes to my prompts are improving the results. You could do that with the evaluation feature that's coming. What you could do is say: I'm building another test agent, I put that in my test workflow, and it checks: does it have the correct language, is it answering the question that was asked, things like that. You could put that into that test framework, and because it's squishy language stuff, an LLM is again in a good position to test it. What you then want to do, and this is probably not going to be in the very first version, is also say: this is what went in, and this is what we actually wanted to see. Then you can test whether your prompting changes move it towards more of the things you want to see and fewer of the things you don't.

I think we're wrapping up here.

Q: Just one small question, a very small question: what about privacy? You touched on that just before. In this particular case there's a good chance we will be sending some sensitive data from an internal database. Is there a generic way to handle that?

A: You could use a self-hosted AI instance. You could use Ollama, which we support, and we support changing the URL for the OpenAI node, so if you have something self-hosted, or an API that does it better, you can use that, especially if it simply emulates the OpenAI API. So yes, food! Thanks.
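The self-hosted option mentioned at the end, pointing an OpenAI-style node at a different endpoint, boils down to swapping the base URL. A minimal sketch, assuming a local Ollama instance exposing its OpenAI-compatible API on the default port; `buildChatRequest` is a hypothetical helper, not an n8n or OpenAI function:

```javascript
// Build an OpenAI-style chat request against an arbitrary base URL, e.g. a
// local Ollama instance ("http://localhost:11434/v1") instead of api.openai.com.
// buildChatRequest is a made-up helper used only to illustrate the idea.
function buildChatRequest(baseURL, model, messages) {
  return {
    url: `${baseURL.replace(/\/+$/, "")}/chat/completions`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const req = buildChatRequest("http://localhost:11434/v1/", "llama3", [
  { role: "user", content: "Summarize employee Mario's orders." },
]);
console.log(req.url); // → http://localhost:11434/v1/chat/completions
// fetch(req.url, req.options) would then send the request to the local model,
// so sensitive data never leaves your own infrastructure.
```

Because the request shape is identical, only the base URL changes between a hosted provider and a self-hosted one, which is exactly why "changing the URL for the OpenAI node" is enough.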
