3 AI Agent Browser Automation Challenges That Keep Getting Harder
16:47

All About AI, 08.03.2026, 2,238 views, 49 likes

Video description
3 AI Agent Browser Automation Challenges That Keep Getting Harder 👊 Become a YouTube Member to Support Me: https://www.youtube.com/c/AllAboutAI/join For Agents: www.skillsmd.store My AI Video Course: https://www.theaivideocourse.com/ 🔥Open GH: https://github.com/AllAboutAI-YT/ Business Inquiries: kbfseo@gmail.com

Contents (4 segments)

Segment 1 (00:00 - 05:00)

Okay, so today I think we have a pretty interesting challenge. I wanted to come up with some kind of challenge for our browser automation using our Claude Code agent. So I thought: what is the most difficult UI I know to navigate? The first thing I thought of was, of course, AWS. If you have seen their UI, it's very complicated if you're not well established in their console system. So I came up with three different challenges that I want to see if our AI browser agent can solve.

You can see we have level one here. The challenge is to create an S3 bucket, upload an image, and launch a static web page that displays the image and some kind of text. On level two, we have: launch a Linux VM, make it accessible with a graphical remote desktop, get it online, and use its browser to open a YouTube video about Claude Code. I don't know if that's possible, but we'll see. And level three is: build and publish a small web app where the user can upload a video, and the app then displays a public page where the uploaded video can be played back. So basically a small YouTube, right?

So these are the three challenges. What we're going to do now is head over to the Mac Mini. I'm going to set everything up. I have created an AWS account, and from there we're just going to see if it can deal with all three levels of the challenge. So yeah, let's just head over to the Mac Mini.

If you haven't seen any of my previous browser automation videos, I'm going to try to explain quickly what we are actually using. We are running Claude Code with this Chrome automation CLI setup. That gives any kind of AI coding agent the ability to control Chrome, open pages, and navigate around using the Chrome debugger through the Chrome DevTools Protocol. That is what we are using, and you'll see it in action as we go on here.
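For readers who have not seen the Chrome DevTools Protocol (CDP) before, here is a minimal sketch of what the commands look like on the wire. The video does not show the internals of the automation CLI, so this is not the author's implementation. `Page.navigate` and `Runtime.evaluate` are real CDP methods, while the helper name `cdp_command` and the example URL are assumptions made for illustration. The sketch only builds the JSON messages; actually sending them requires a WebSocket connection to a Chrome instance started with remote debugging enabled.

```python
import itertools
import json

# Each CDP command carries a unique integer id so replies can be matched to requests.
_ids = itertools.count(1)


def cdp_command(method, **params):
    """Serialize one CDP command as JSON text (hypothetical helper, not from the video)."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# Ask the page to load a URL (illustrative URL, assumed for this example).
navigate = cdp_command("Page.navigate", url="https://console.aws.amazon.com/s3/")

# Ask the page to evaluate a JavaScript expression and return its value.
read_title = cdp_command(
    "Runtime.evaluate", expression="document.title", returnByValue=True
)

print(navigate)
print(read_title)
```

An agent built on this idea would loop: send a command, read the reply or a screenshot, decide on the next command.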
So now we're just going to clear this up, because this is the browser we're going to use. I guess we can just go to AWS here and see if we are signed in to the console. Yeah, you can see this. I did some tests here and it seems to be working. So we are logged in. That's good to see. If you go back to Cursor now, you can see our challenges here: using only the AWS console in the browser. This is going to be level one, right? So we're going to give it this challenge first. I'm just going to open up a fresh Claude Code instance here in our browser automation setup, and let's just paste in the challenge. And I'm just going to say, "You should be logged in on the console when you navigate there." So, I just closed it down. Let's start this now. I'm going to use kind of the evolutionary principle here: it's just going to keep trying, and hopefully it figures out some way to do this.

So, I want to see now if it's going to open up the AWS console first. Yeah, you can see it went straight to S3. That's pretty good. I kind of want to follow along here for a while, but if it's going to take a very long time, we're just going to shut down the camera and do some screen recording. But you can see it's already on the S3 page here. That's pretty good. And you can see we have the Create bucket button here. Yeah, it's navigating to that pretty easily so far. Now it's probably going to have to enter a bucket name, I guess. Yeah: "Let me type in the bucket name first." Maybe I can zoom in a bit here. Let's see. Yeah, that's a bit better. So, we can see we have the bucket name field here. It's probably going to figure out a way to use the Chrome setup we have to actually put in a name. Okay, so it put in EJ Oslo site 2026. It just keeps scrolling down now. Maybe it's looking for the Create bucket button, I guess.
Yeah, it scrolled all the way down. And now it probably just has to click on Create bucket. Yeah, and that's it. Let me move this thing here. So you can see we created the bucket. Here we have it. Perfect.

I want to see now if it moves on to the next part, and that is going to be the upload. Yeah, we are on the upload part now. So here is the upload, and we gave it that me.png image. Okay. So you can see it put in me.png and the Oslo index.html. That's looking pretty good. It added a new file, index.html. Okay. But now it removed the image too. Yeah, it said so, and now it wants to add the image back. Okay. Yeah, that was done. It's doing a screenshot to confirm, and then upload. Okay, so you can see now we have our two files: we have the image file and we have the HTML file, index.html. Now it goes to the properties, and we want to set up some static website hosting, I

Segment 2 (05:00 - 10:00)

guess. Okay. So, it says it found the static website hosting Edit button and clicked it. Okay. And now it needs to select, probably, enable website hosting. Yes. So, it's going to type index.html in the index document field here, I think. And it's saved. Did we get a URL for this? I think it's actually looking at doing some public access settings now, because we didn't do that. Yeah, it found it. So, it unchecked Block public access and probably just has to save this now. And here we have a confirmation, so now it needs to type in "confirm". Okay. So now you can see this is off. That's pretty good. Then we can move on, I guess.

Okay. So this was pretty interesting. It's having some issues editing the bucket policy, so now it's going to try to open AWS CloudShell and do it that way. I think that's a pretty interesting solution. Instead of just bashing your head against the same thing, try something else. I have a video coming up on that, and I think that's going to be pretty interesting. And now we are in the shell, right? So let's see what commands it's going to put in here. Now, this is a bit of cheating, but I guess it's fine. Okay, so it's going to do the CLI command here. Yeah, you can see it here. Okay, so it looks pretty happy with the command it used in CloudShell. And yeah, that looks pretty good, to be honest. And if we go to this here now, yeah, we got it. I guess it's not secure, but "me and Oslo": we have the page and the image. So, yeah, very happy about this. I would say it passed this. It spent some time, though. It spent 40 minutes, so that was not optimal.

But we're going to do one thing now. I'm going to say: "Good job. To save time, save the learnings, so next time you are on AWS you will save time navigating." This is kind of how we train our browser agent to be more effective the next time we use it, right? So, I'm just going to create this skill and then we're going to move on to challenge number two. Okay.
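Going back to the bucket policy step in level one: the transcript does not show the exact command or policy the agent used in CloudShell, so the following is only a sketch of what a public-read policy for an S3 static website commonly looks like. The function name `public_read_policy` and the bucket name `example-bucket` are assumptions for illustration; the policy fields (`Version`, `Statement`, `Principal`, `Action`, `Resource`) follow the standard S3 bucket policy format.

```python
import json


def public_read_policy(bucket_name):
    """Build a bucket policy that lets anyone read objects in the bucket.

    Hypothetical helper for illustration; not taken from the video.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # The trailing /* applies the rule to every object in the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# "example-bucket" is a placeholder, not the bucket created in the video.
policy_json = json.dumps(public_read_policy("example-bucket"), indent=2)
print(policy_json)
```

Saved to a file such as `policy.json`, a document like this could be applied from CloudShell with `aws s3api put-bucket-policy --bucket <name> --policy file://policy.json`. It only takes effect once Block public access is turned off for the bucket, which matches the order of steps seen in the video.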
So, now that we have created our skill to navigate, right, I think we're just going to clear the context window and start with a read. I guess, just to be sure, we can do a full restart here. And let's just read the AWS skill first, and then we're going to give it the second challenge. So, I'm going to write that up, and I end up with: "You did pass challenge level one. Now, here is challenge two," and we put in the AWS console, free Linux VM, blah blah, "Good luck completing the challenge." So, basically it's going to try to launch a free Linux virtual machine, get some graphical remote desktop, get it online, and use the browser to play a YouTube video. I don't know if this is possible, but we're going to give it a chance.

Okay. So, I'm going to kick this off. But this time I'm not going to stay here and watch, because this is a long-running task, right? I'm just going to turn off the camera, keep recording, and play it back for you, and then we're going to look at how it solved this, or what results we ended up with. So, like I said, let's see if it fires up the browser here. Yeah, we go straight to Launch an instance. Okay. So, I'm just going to turn off the camera now and record on the Mac Mini, and let's see what we end up with at the end here.

Okay. So, you can see now we have actually launched the VM. This is going to be interesting. There are some credentials we need to enter here, like a password. I'm not quite sure if it can do that. If not, I'm probably going to do it manually. Okay, I think we are in. So, you can see now we have launched the Ubuntu instance here. That's pretty cool. I don't think we are online, though. But let's see how it's going to navigate this now. You can clearly see we have the Ubuntu VM up here. Okay. So, it says it wants to use CloudShell to launch Firefox.
Okay. Yeah, it wants to launch Firefox with "Claude Code plus Anthropic". Okay. So, you can see it's going to a YouTube video, but it doesn't look like there's any connection here. I guess we could wait a while, but

Segment 3 (10:00 - 15:00)

if it doesn't work, I think we're just going to call it there, because I'm super happy anyway with how it turned out. I guess we only had a small issue at the end here. There might even be some... Okay, we are loading. So that is pretty cool. That is pretty crazy. I wasn't really expecting us to actually be able to do this just by using browser automation. I guess that just shows how powerful this has become now. So let's give it a few minutes and see if this loads. But I don't think it's going to. I don't know if it has enough memory for this. No, it doesn't look like anything is happening, but I've got to give this a pass, because we did actually create the instance. We launched the VM. We set everything up, and we actually went to the VM in a non-headless mode, and we did put in a YouTube address, and it did render, but not 100%. But yeah, I've got to give this a pass.

Now, let's do the final level three challenge. Okay, so now of course we're just going to do: "You did pass level one and two. Here's the challenge for level three," and it's going to be: using only the AWS console in the browser, build and publish a small web app where the user can upload a video, and the app then displays a public page where that uploaded video can be played back. So again, we repeat this: "Good luck completing the challenge. Use the AWS skill to navigate." As before, that is all I'm going to do. I'm going to remove the old things here first and clean this up. And again, I'm going to turn off the camera and just let this run, and I'm going to come back when it looks like we are actually going to complete this.

Okay, so far I think it cheated a bit there, because it only went to CloudShell even though I told it to only use the browser automation, or use the browser. I went and did something else, and I came back and I see it's just in CloudShell. But I'm going to give it a pass.
And I think it's actually closing in on the end here now. I had a quick look at what it has done so far, and it's been super fast. I think it spent like 3 or 4 minutes so far actually doing this. So I want to see how close we are now. We're going to do some HTML and stuff now for the front end. And from there, I think we are pretty close to actually launching the app, and we can test it out. You can see we did the HTML now with the CSS. Upload a file, and the user can... yeah, we're uploading the index now, and the user can play it back. So it's going to be interesting to see if it actually works.

Okay, so here you can see this is the app. So: share, upload a video, get a public playback page. Do I have any videos here? I might. So let me find a video and try to upload it. Yeah, I have an old introduction video here to my AI agent that I tested a long time ago. Let me just drag this in here. Okay, so it failed to fetch. That's interesting. Let's see if we can send this log. So, it did an install, according to the log here. That's pretty good. Okay, so here we have it again. Let's try now. Let's do the same video and, hopefully... Okay, so it's uploading. And here we have it. So, that's pretty cool. We get a direct link.

What I'm going to do now is head over to this MacBook here, and I'm actually going to go to that link and see if we can play it. Okay. So, here is the link, right? Let me copy this, and let's see now if we can actually see the video. Yeah, it is loading. Can we play it? (Video audio: "Hi guys, hope you are doing.") Yeah. And let's see if we can upload. Now I want to do a bit more: let's do a 200 megabyte file here. So I'm just going to drag this here. Okay. And yeah, it is uploading. That's pretty cool. I'm going to switch back now to the Mac Mini and see if the video pops up there. Okay. So, let's just refresh this.
And we have it. Does it play? Yeah, that works. So, now we kind of have our shared video platform where users can upload videos. So, yeah. I think it actually did cheat a bit, because it only leveraged CloudShell, but I'm going to give it a pass because I wasn't

Segment 4 (15:00 - 16:00)

here to watch it, so I didn't really pick up on it. But other than that, how easy was that? I created this video upload platform that is online in like 5 to 10 minutes, just by using Claude Code and the agent. We didn't really get to test the browser part here, but other than that I was super happy.

So, if we go back to the challenges, the first one was of course slow, because we actually used the console for everything. And I think the other two were a bit of cheating: we didn't use CloudShell at the beginning of level two, but in level three we only used it. But overall, I think this just shows how powerful these agents are getting now, especially Claude Code and these Codex CLIs, and how many things they can do if they only have the tools available. I'm going to do a video soon about a kind of theory I have. It has something to do with evolution, and yeah, it's a fun theory about how these systems are evolving. So, that's going to be later.

I hope you enjoyed this three-part challenge. And again, I think it passed pretty flawlessly, like I said. So, if you like content like this and you want to see more of it, just give this video a like. Also, if you want skills, let me see, skillsmd.store. If you want to get the setup I have, you can just go to Info and read here whether this is something you think you need. I've done a few videos on this in my back catalog if you want to watch them. So, yeah, thank you for tuning in. Hope you found it interesting and I'll see you again
