Links from today's video:
https://www.aboutamazon.com/news/innovation-at-amazon/amazon-nova-website-sdk
Music Used
LEMMiNO - Cipher
https://www.youtube.com/watch?v=b0q5PR1xpA0
CC BY-SA 4.0
LEMMiNO - Encounters
https://www.youtube.com/watch?v=xdwWCl_5x2s
Contents (2 segments)
Segment 1 (00:00 - 05:00)
In a world where web AI agents are probably the future, a new contender has emerged in the race for truly autonomous digital assistance. Following OpenAI's Operator and Anthropic's computer use capabilities, this latest innovation joins the growing ecosystem of AI systems designed to take control of web browsers and perform tasks independently on behalf of users. As these companies push the boundaries of what's possible with agent-based AI technology, we're witnessing an acceleration in systems that can navigate websites, complete purchases, and schedule appointments with unprecedented accuracy. Developed by the AI researchers at Amazon, this is Amazon Nova Act, a groundbreaking new AI agent that can browse the web autonomously. Now, this AI agent didn't just come out of nowhere. Amazon Nova Act is based on the Amazon Nova ecosystem. You see, Amazon decided to step up their game and release some AI models of their own. So, let's take a deep dive into how all of this works, because I'm sure you're eager to learn how Amazon is stepping up their AI game.

All right, so in this first video, what you're going to see is that instead of generating text or answering questions, Nova Act gives AI the power to actually use websites. In this demo, they show it searching for apartments, clicking buttons, and even doing things like calculating commute time or organizing data. It works by breaking big tasks into smaller steps that AI can handle more reliably, and it's designed so developers can easily build with it in Python. So, if you're curious about how AI is going to start doing stuff for us online, this is a glimpse into the future.

Soon, there will be more AI agents than people browsing the web, doing tasks on our behalf. That's why we built Nova Act, an SDK designed for developers to build and deploy web agents that actually work.
Look, it's not going to be too long until these agents can land spacecraft, but we're not there yet in reliability. Nova Act meets the models where they are by allowing developers to break down complex jobs into clear steps that the model can follow, giving you granular control without the babysitting. Let's see Nova Act in action.

Guiding the model is as easy as making an act call, which translates natural language into actions on the screen. You can chain multiple act calls together to construct increasingly complicated workflows. This blockwise approach makes workflows more consistent, accurate, and reliable.

In this example, we'll use Nova Act to find our dream apartment. We're searching for a two-bedroom, one-bath in Redwood City. Here, we've given our first act call to the agent. It's going to break down how to complete this task, considering the outcome of each step as it plans the next one. Behind the scenes, this is all powered by a specialized version of Amazon Nova, trained for high reliability on UI tasks.

We designed the SDK to integrate seamlessly with all your favorite Python tools and libraries, making it easier to do cool stuff. All right, we see a bunch of rentals on the screen. So, let's grab them using a structured extract. We'll define a Pydantic class and ask the agent to return JSON matching that schema. For my commute, I want to know the biking distance to the nearest Caltrain station for each of these results. Let's define a helper function. Add biking distance will take in an apartment and then use Google Maps to calculate the distance. Now, I don't want to wait for each of these searches to complete one by one. So, let's do this in parallel. Since this is Python, we can just use a thread pool to spin up multiple browsers, one for each address. Finally, I'll use pandas to turn all these results into a table and sort by biking time to the Caltrain station. We've checked the script into the samples folder of our GitHub repo.
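The apartment workflow described above (a structured extract, a per-listing helper, a thread pool, then a sort) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script from the repo: the real agent and browser calls are replaced by made-up sample data, a standard-library dataclass stands in for the Pydantic class, and `sorted` stands in for the pandas table. The names `Apartment`, `parse_listings`, `add_biking_time`, and `rank_by_commute` are hypothetical.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Apartment:
    """Schema the agent's JSON is expected to match (stand-in for the Pydantic class)."""
    address: str
    price: str
    beds: int
    baths: int
    biking_minutes: Optional[int] = None


# Made-up sample data standing in for what a structured extract might return.
SAMPLE_LISTINGS_JSON = """[
  {"address": "1 Example St", "price": "$3,100", "beds": 2, "baths": 1},
  {"address": "2 Sample Ave", "price": "$2,950", "beds": 2, "baths": 1},
  {"address": "3 Placeholder Rd", "price": "$3,300", "beds": 2, "baths": 1}
]"""

# Made-up biking times; a real helper would open a browser and query a maps site.
FAKE_BIKING_MINUTES = {"1 Example St": 12, "2 Sample Ave": 7, "3 Placeholder Rd": 19}


def parse_listings(raw_json: str) -> List[Apartment]:
    """Turn the JSON into typed records; a missing or unexpected key raises TypeError."""
    return [Apartment(**item) for item in json.loads(raw_json)]


def add_biking_time(apartment: Apartment) -> Apartment:
    """Hypothetical helper: attach the biking time for one listing."""
    apartment.biking_minutes = FAKE_BIKING_MINUTES[apartment.address]
    return apartment


def rank_by_commute(apartments: List[Apartment]) -> List[Apartment]:
    """Look up every listing in parallel, then sort by biking time (shortest first)."""
    # One worker per address, mirroring "one browser per address" in the demo.
    with ThreadPoolExecutor(max_workers=max(1, len(apartments))) as pool:
        enriched = list(pool.map(add_biking_time, apartments))
    return sorted(enriched, key=lambda a: a.biking_minutes)


if __name__ == "__main__":
    for apt in rank_by_commute(parse_listings(SAMPLE_LISTINGS_JSON)):
        print(f"{apt.biking_minutes:>3} min  {apt.address}  {apt.price}")
```

A thread pool fits here because each lookup spends most of its time waiting on a browser rather than computing, so the searches can overlap instead of running one by one.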
So, feel free to give it a try. And this is only one example. The SDK is yours to explore. So, dive in and see what's possible with Nova Act. We're really excited to see what you'll build.

Okay, so there is also another video that I want you to see, on how Amazon's Nova Act actually works under the hood. Not just what it does, but how it does it. The focus here is reliability. They're showing how Nova Act can break down tasks into simple, small steps like clicking a button, then choosing a date or typing into a field, just like a real person would when using the app. Now, these little steps are called the building blocks of AI agents. And once the model can do these reliably, you can then combine them to automate more complex workflows like requesting time off or setting up auto replies. So if you're interested in seeing how it works behind the scenes, this is the video for you.

An agent isn't very useful if it only works some of the time. We're focused on making Nova Act reliable at executing the building blocks that make up workflows. We're teaching our agent to have the same intuitions about screens that we have. This means intuitively interacting with UI elements like icons, forms, search fields, date pickers, and drop
Segment 2 (05:00 - 09:00)
down menus. Let's see an example of how to combine these building blocks. Here's a workflow where my colleague requested some time off by stringing together a few simple act commands. First, Nova Act will set up a calendar hold. It breaks this task down step by step, typing into open text fields, picking the right date, and selecting from drop downs. Each step starts with a thought. The thought considers what it sees on the screen along with the best next step to accomplish the goal. Then, based on that thought, it takes an action.

Now that Nova Act has completed the first task, it'll set up an automatic email reply. Notice the same kinds of UI elements and atomic actions. These make up the foundation for navigating the entire web and using any kind of software, which is why we've focused on training our agent to be reliable on these building blocks. Finally, Nova Act will submit a leave request. In this case, we've already logged in with our credentials so we can submit the request seamlessly. Here, we're seeing a third context where the same basic ingredients can be recombined to complete our task. This was just one of the many routine workflows that we think Nova Act can help with on a daily basis. These can be simple things like booking meeting rooms or more complex things like filing expense reports. These tasks add up. So by using Nova Act to take care of them, you can free yourself to focus on the things that really matter.

In this next video, you're going to see one of the most powerful and honestly underrated parts of Amazon Nova Act: scheduling AI agents to run on their own. This is about real AI automation. Once you've built a task for your agent, like ordering food or filling out a form, you can set it to run automatically on a schedule. You don't have to babysit. There's no manual triggering. The example they show is super fun and relatable: getting the same salad delivered every Tuesday, even without touching your computer.
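The thought-then-action cycle described in this segment (look at the screen, form a thought about the best next step, take one action, repeat) can be illustrated with a toy loop. This is a sketch under heavy assumptions and not how Nova Act is implemented: the "screen" is just a dictionary of form fields, the "model" is a hand-written rule, and every name here (`Step`, `toy_decide`, `toy_apply`, `run_act`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Screen = Dict[str, str]  # toy stand-in for what the agent sees: field name -> current value


@dataclass
class Step:
    thought: str
    action: str


def toy_decide(goal: Dict[str, str], screen: Screen) -> Tuple[str, str]:
    """Hand-written rule standing in for the model: fix the first field that is wrong."""
    for field_name, wanted in goal.items():
        if screen.get(field_name) != wanted:
            thought = f"The '{field_name}' field does not show '{wanted}' yet."
            return thought, f"set {field_name}={wanted}"
    return "Every field matches the goal.", "done"


def toy_apply(screen: Screen, action: str) -> Screen:
    """Apply one atomic action and return the new screen state."""
    name, value = action[len("set "):].split("=", 1)
    return {**screen, name: value}


def run_act(goal: Dict[str, str], screen: Screen, max_steps: int = 10) -> List[Step]:
    """Repeat thought -> action until the rule reports 'done' or the step budget runs out."""
    trace: List[Step] = []
    for _ in range(max_steps):
        thought, action = toy_decide(goal, screen)
        trace.append(Step(thought, action))
        if action == "done":
            break
        screen = toy_apply(screen, action)
    return trace


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the video.
    goal = {"title": "Out of office", "date": "2025-04-01"}
    for step in run_act(goal, {"title": "", "date": ""}):
        print(f"thought: {step.thought}\naction:  {step.action}\n")
```

Because each action is chosen only after re-reading the current screen, the same small set of atomic actions can be recombined for the calendar hold, the auto-reply, and the leave request.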
They also show how to run it in headless mode, which just means it works behind the scenes without showing anything on screen.

If you have to babysit your automation, it's not really an automation. That's why reliability is the core of everything that we have built. Once you've got your workflow up and running, you can easily switch to headless mode and even set it to run on your own schedule. Let's take a look at how simple and powerful this can be. I get the same salad delivered every Tuesday night. So, I wanted to see if Nova Act could make this any easier. I put together a workflow and used a cron job to run it on a schedule. I love the Shroomami, so I instructed the model to search for it and add it to my cart. The agent scrolls to find the right bowl, adds it to my bag, and even does the checkout for me. And just like that, my dinner arrives at my doorstep right on time without me lifting a finger. That was the on-screen version. But switching to headless mode is as simple as flipping a switch. Check out either version of the script for yourself on our GitHub repo and explore other ways Nova Act can work for you.

Next, let's take a look at a fun example that you can use your Amazon Nova Act AI agent for.

We've imagined a lot of possibilities for Nova Act, from booking appointments to automating QA testing. Now, we're excited to see what you can build and all the crazy, goofy, out-of-the-box things you can imagine. When I was trying out Nova Act, I wanted to do something silly. So, I found this fun pigeon battling game, opened up interactive mode, and quickly wrote a few sentences instructing the agent to assign stats to our pigeon, then go battle other pigeons. Now, we definitely didn't train our model to excel at pigeon tournaments, so I wasn't sure if this would actually work. To my surprise, Nova Act successfully assigned all the stat points, even adding one point to defense. It then defeated two pigeons in a row and eventually evolved into a huge muscular pigeon.
Now it's your turn to explore and we can't wait to see what you create.
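For the scheduled salad order mentioned earlier, a script that can be flipped between on-screen and headless mode and triggered by cron might be shaped like this. It is a minimal sketch, not the script from the GitHub repo: the browser-driving part is replaced by a placeholder, and the file name `order_salad.py`, the `--headless` flag, the paths, and the 18:00 run time are all assumptions made for illustration.

```python
import argparse
from typing import List, Optional

# Example crontab entry (assumed paths and time): run at 18:00 every Tuesday.
# Fields are minute, hour, day of month, month, day of week (2 = Tuesday).
#
#   0 18 * * 2 /usr/bin/python3 /path/to/order_salad.py --headless


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical salad-ordering workflow")
    parser.add_argument(
        "--headless",
        action="store_true",
        help="run without a visible browser window",
    )
    return parser


def order_salad(headless: bool) -> str:
    """Placeholder for the real workflow (search for the bowl, add to cart, check out)."""
    mode = "headless" if headless else "on-screen"
    return f"ordered salad in {mode} mode"


def main(argv: Optional[List[str]] = None) -> str:
    args = build_parser().parse_args(argv)
    return order_salad(headless=args.headless)


if __name__ == "__main__":
    print(main())
```

Keeping the mode behind a single flag means the same script can be watched on screen while it is being developed and then left to cron once it is trusted.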