We Maxed Out n8n - Here’s When It Broke
Duration: 13:32


n8n · 07.07.2025 · 13,797 views · 292 likes


Video description
In this video, we stress test n8n on two AWS instance types (c5.large and c5.4xlarge), in both single mode and queue mode, to see when performance breaks under pressure. We ran three critical benchmarking scenarios:

* Single webhook: one flow triggered repeatedly
* Multi webhook: 10 workflows triggered in parallel
* Binary data: large file uploads and processing

Each test scaled from 3 to 200 virtual users to measure:

* Requests per second
* Average response time
* Failure rate under load

We used k6 for load testing, Beszel for live resource monitoring, and n8n's own benchmarking workflows to automate the process and visualize performance over time.

Want to test your own setup?

- n8n Benchmarking Guide: https://docs.n8n.io/hosting/scaling/performance-benchmarking/
- Queue Mode Setup: https://docs.n8n.io/hosting/scaling/queue-mode/
- Docker Installation Guide: https://docs.n8n.io/hosting/installation/docker/
- K6 Load Testing: https://k6.io/
- Beszel Monitoring: https://www.beszel.dev/
- n8n Benchmark Scripts on GitHub: https://github.com/n8n-io/n8n/tree/master/packages/%40n8n/benchmark

Whether you're a solo developer or running an enterprise automation platform, this benchmark shows exactly how far n8n can go, and where it starts to fall apart. Subscribe for more architecture breakdowns, real-world benchmarks, and workflow optimization tips.

Want to connect? Find me on social media and reach out to me directly:

- LinkedIn: https://www.linkedin.com/in/angelgmenendez/
- X: https://x.com/djangelic

#n8n #loadtesting #benchmark #queuescaling #devops #k6 #docker

Contents (3 segments)

Segment 1 (00:00 - 05:00)

Hey everyone, welcome to the ultimate stress test showdown. Have you ever wondered how much your n8n server can handle before it cries for help? If so, you're in the right place. In this video, we're going to be putting different cloud hardware setups through their paces. We'll simulate heavy traffic, max out resources, and see which setups come out on top. Think of it as a gym workout for our servers: no weights required, but we promise it'll be just as thrilling. Now, before we dive into the numbers and the nitty-gritty, let's talk about why it's so important to stress test in the first place. Whether you're running a small personal project or managing an enterprise-level application, knowing how your infrastructure handles extreme conditions can save you from unexpected downtime and headaches down the road. Plus, it's just really cool to see what your hardware is capable of. Now, speaking of hardware, let's see what we're running under the hood. We're going to be using AWS for all of our benchmarking today. For the main benchmarks, we're using a c5.large instance type. If we look up c5.large in the AWS instance-type catalog, we can see exactly what hardware is under the hood: two virtual CPUs, 64-bit architecture, 4 GB of memory, and up to 10 Gbps of network bandwidth. We're paying about 8 cents per hour to run this hardware. Later, we're going to run the same tests on slightly beefier hardware, so stay tuned. On top of our c5.large instances, we're running two separate n8n deployments: a single-main instance and a queue-mode instance, so we can see how the architecture affects the final benchmarks. In addition to the n8n instances, we're running k6. k6 is an open-source load-testing tool from Grafana Labs.
It makes benchmarking simple, driving the whole process with small scripts. To make it even easier, n8n has published its own k6 scripts that anyone can use; you can find them on the n8n GitHub page. These let us quickly spin up a k6 instance and, from there, deploy multiple scenarios to stress test different aspects of n8n. In today's video, we're running three scenarios: a single-webhook stress test, a multi-webhook stress test, and a binary-file stress test. Now, if you jump into the GitHub repo, you can see all of the scenarios that ship with it. There are several more, and in future videos we'll tackle a few of those as well, and maybe even build our own custom scenarios. Lastly, on top of the k6 instance, you'll notice we've spun up a Beszel instance. Beszel lets us visually monitor the system resources on our n8n instances, so we can see in graph form what's happening behind the scenes over time. You can find more information at beszel.dev; it's a very simple, lightweight server-monitoring tool. Now that we know what's under the hood, let's look at how we're going to run these stress tests. When I first started using k6, I used a docker command to trigger my benchmarking suite manually. What I later found was that this process was very slow, and I had to constantly monitor the server to make sure each test had completed before running the next one. Because I work at n8n, I thought there's got to be a way to turn this into a workflow, and in fact there was. So what I've done is daisy-chain three different scenarios across six different tests, and what we're going to vary today is the number of virtual users running those scenarios.
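The daisy-chained runs described above can be sketched as a small driver script. This is a hedged illustration, not n8n's actual benchmarking workflow: the scenario names and the `/scripts/...` path inside the container are assumptions, though the `docker run grafana/k6 run --vus … --duration …` shape matches k6's documented CLI.

```python
# Sketch of a driver that daisy-chains k6 scenarios across increasing
# virtual-user (VU) levels. Scenario and script names are illustrative;
# `--vus` and `--duration` are real k6 CLI flags.

SCENARIOS = ["single-webhook", "multi-webhook", "binary-data"]  # assumed names
VU_LEVELS = [3, 10, 30, 50, 100, 200]                           # ramp from the video

def k6_command(scenario: str, vus: int, duration: str = "2m") -> list[str]:
    """Build one docker invocation of the grafana/k6 image for a single run."""
    return [
        "docker", "run", "--rm", "-i", "grafana/k6", "run",
        "--vus", str(vus),
        "--duration", duration,
        f"/scripts/{scenario}.js",  # hypothetical script path in the container
    ]

def build_test_plan() -> list[list[str]]:
    """Every scenario at every VU level, in order, like the daisy chain."""
    return [k6_command(s, v) for s in SCENARIOS for v in VU_LEVELS]

if __name__ == "__main__":
    for cmd in build_test_plan():
        print(" ".join(cmd))
```

In practice, each command would be executed sequentially (the workflow in the video waits for one run to finish before triggering the next) and its summary written to the spreadsheet.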
So, for example, our first test is the single webhook. This mimics sending a webhook request to an n8n server and getting a 200 response back saying, hey, everything's okay, we received that webhook call, we're good to go. Initially, we run three virtual users at a time. That's not very stressful, so we start ramping it up: 10 virtual users at a time, then 30, then 50, then 100, and finally 200, to see just how much n8n can handle. Once that's done, we switch over to queue mode and see how much queue mode can handle. Now, in an enterprise environment, when we deploy queue mode, we typically spread it across several different AWS instances. In this case, what I want is a baseline of exactly how much a single AWS instance can handle in both setups. So in this particular test, queue mode is actually at a disadvantage: it's running one instance with multiple Docker containers and images, as opposed to the single-main setup, which has just one n8n image with everything packaged inside it. Now, from there, we're going to be able to see where the bottlenecks are and see which version of

Segment 2 (05:00 - 10:00)

n8n handles each kind of stress better than the others. To visualize these benchmarks, we're deploying a workflow that triggers each benchmark automatically. It uses a spreadsheet to launch each scenario at each virtual-user level and, as each run finishes, writes the results back into the spreadsheet. From there, we turn that into a graph. What this graph shows us is the requests per second that n8n is processing, the average duration in seconds to receive a response, and the percentage of those requests that fail. In other words, in real time we can see how many requests per second we can handle, how quickly we get a response, and how many of those requests actually make it through. Once a set of scenarios is run, we move on to the next one: from single webhook to multi webhook to binary data, automating our benchmarking so we can see at exactly which point things fail. To visualize resource usage, we're using Beszel, which gives us CPU usage, Docker CPU usage, memory usage, and Docker memory usage. On top of that, we'll SSH into the Linux instance and run htop, which gives us real-time feedback on what's going on inside the server. And lastly, we'll show the graph, which plots the results of each benchmark in real time. So, without further ado, let's take a look and see what happens. All right, let's start small: one workflow, one endpoint, hammered by an increasing wave of traffic. This is the single-webhook scenario, and it's where we begin to see just how far a single n8n instance can go. We used VUs, or virtual users; think of them as bots or users repeatedly triggering the same webhook for two minutes straight.
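The three graphed metrics (requests per second, average response duration, failure rate) are simple reductions over raw samples. A minimal sketch, assuming each sample is a (duration_seconds, succeeded) pair collected over the test window:

```python
def summarize(samples, window_seconds):
    """Reduce raw request samples to the three graphed metrics.

    samples: iterable of (duration_seconds, succeeded) pairs
    window_seconds: length of the test run (e.g. 120 for a 2-minute test)
    """
    samples = list(samples)
    total = len(samples)
    failures = sum(1 for _, ok in samples if not ok)
    return {
        "requests_per_second": total / window_seconds,
        "avg_duration_s": sum(d for d, _ in samples) / total if total else 0.0,
        "failure_rate_pct": 100.0 * failures / total if total else 0.0,
    }

# Illustrative input, loosely mirroring the c5.large baseline:
# 1800 requests over a 120-second window works out to 15 req/s.
stats = summarize([(0.2, True)] * 1782 + [(12.0, False)] * 18, 120)
```

k6 reports these same aggregates itself; this sketch just shows what the spreadsheet columns mean.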
We scaled from 3 to 200 VUs to see where the cracks would form, measuring not just speed but stability under pressure. First, on a c5.large instance, that's two vCPUs, 4 GB of RAM, and up to 10 Gbps of bandwidth, n8n's single-mode deployment held surprisingly strong. From 3 to 100 virtual users, we saw a consistent 15 requests per second of throughput and 200-millisecond response times with 0% failures. That's solid performance for a modest server setup. But at 200 virtual users, things started to wobble: response time shot up to 12 seconds, and we logged a 1% failure rate. Not catastrophic, but a clear sign we were hitting the ceiling of what a single-main setup could handle. Then we enabled queue mode, n8n's more scalable architecture that decouples webhook intake from workflow execution. Same server type, same test, completely different results. Throughput jumped to 72 requests per second, latency dropped under 3 seconds, and the system handled up to 200 virtual users with zero failures, all without changing the hardware. So what happens if we upgrade the machine? We reran the test on a c5.4xlarge, 16 vCPUs and 32 GB of RAM, in single mode. Requests per second nudged up slightly to 16.2, and response times improved a bit. But the real leap came in queue mode: we hit a consistent 162 requests per second and held that rate across a full 200-virtual-user load, kept latency below 1.2 seconds, and still saw no failures. That's over 10 times the throughput of our baseline, just by scaling vertically and using the right architecture and backend. In the multi-webhook scenario, we simulate 10 distinct workflows, each triggered by its own webhook. Think of it as enterprise multitasking; this is closer to what production looks like for many real-world n8n deployments. On the c5.large in single mode, performance degraded quickly. At 50 virtual users, response time spiked past 14 seconds with an 11% failure rate.
At 100 virtual users, it jumped to 24 seconds with a 21% failure rate. And by 200 virtual

Segment 3 (10:00 - 13:00)

users, the system had a 38% failure rate and a 34-second response time. Not ideal. Switching to queue mode, however, brought performance back under control: it sustained 74 requests per second from 3 to 200 virtual users, with latency staying within acceptable ranges and a 0% failure rate. Same hardware type, radically different results. But once again, the c5.4xlarge redefined what was possible. Single mode still struggled, peaking at 23 requests per second with a 31% failure rate. But in queue mode, we hit 162 requests per second, maintained that across all loads, and still had a 0% failure rate, with latency held to about 5.8 seconds even under maximum pressure. Clearly, multitasking needs more muscle. Now, for the binary-data benchmark, the most punishing test in the suite. These workflows handle large file uploads, images, PDFs, media, the kind of stuff that eats RAM and disk storage for breakfast. On the c5.large in single mode, we were already struggling at the low end. At three virtual users, we saw just three requests per second. At 200 virtual users, response times ballooned and 74% of the requests failed. That's not just a slowdown; that's operational failure. Queue mode bought us more time: failures started later in the load curve, but at 200 virtual users it still collapsed, peaking at an 87% failure rate with incomplete payloads. Then we fired up the c5.4xlarge, and the difference was night and day. In single mode, we reached 4.6 requests per second, trimmed response times by a third, and dropped the failure rate from 74% to just 11%. Not perfect, but vastly improved. In queue mode, we peaked at 5.2 requests per second and, more importantly, held a 0% failure rate across the entire load test. That means every large file was received, processed, and responded to successfully. It's not just about architecture: binary-heavy flows need serious memory, CPU, and disk throughput. So what do all these tests tell us? Number one, queue mode isn't optional.
It's your first lever for real scale. Even on entry-level hardware, it delivers huge performance boosts with minimal configuration. Number two, hardware matters. Upgrading to a c5.4xlarge can more than double throughput, cut latency in half, and reduce or eliminate failures entirely. Number three, binary data breaks everything unless you prepare. That means more RAM, faster disks, shared storage like S3, and multiple workers that can handle parallel input and output. If you're building automations that power internal teams, back-end systems, or customer-facing apps, don't wait for bottlenecks to force an upgrade. Plan for growth from the start: use queue mode to separate ingestion from processing, scale horizontally with workers for concurrency, and size your hardware based on the nature of your flows. Simple triggers are light, but binary data and multitasking demand more. n8n is built to scale, but like any powerful engine, it needs the right fuel and the right track to reach its full potential. And that's a wrap. We've seen some impressive performances and maybe a few surprises along the way. Remember, the key to a robust system is preparation and testing, so don't shy away from pushing to the limit. If you've enjoyed this deep dive into stress testing, hit that like button and subscribe for more of these tech adventures. And don't forget to leave a comment on what you want to see next. Until then, keep exploring, keep testing, stay curious.
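As a sanity check on the "over 10 times" claim, the single-webhook throughput figures reported in the video can be tabulated and compared directly. The numbers below are transcribed from the transcript; the table layout itself is mine:

```python
# Single-webhook throughput (requests/second) as reported in the video.
REPORTED_RPS = {
    ("c5.large",   "single"): 15.0,
    ("c5.large",   "queue"):  72.0,
    ("c5.4xlarge", "single"): 16.2,
    ("c5.4xlarge", "queue"):  162.0,
}

def speedup(baseline: tuple, upgraded: tuple) -> float:
    """Throughput ratio between two (instance, mode) configurations."""
    return REPORTED_RPS[upgraded] / REPORTED_RPS[baseline]

# Queue mode alone (same hardware): 72 / 15 = 4.8x.
# Queue mode plus the bigger instance: 162 / 15 = 10.8x,
# which is the "over 10 times the throughput" result.
```

Note how little single mode gains from 16 vCPUs (15 → 16.2 req/s): the architecture, not the hardware, was the first bottleneck.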
