NEW Google Computer Use AI Agent is INSANE!
8:13

Julian Goldie SEO · 09.10.2025 · 6,697 views · 129 likes · updated 18.02.2026
Video description
Want to get more customers, make more profit & save 100s of hours with AI? https://go.juliangoldie.com/ai-profit-boardroom Get a FREE AI Course + Community +1,000 AI Agents + video notes + links to the tools 👉 https://www.skool.com/ai-seo-with-julian-goldie-1553/about 🤖 Need AI Automation Services? Book a FREE AI Discovery Session Here: https://juliangoldieaiautomation.com/ 🚀 Get a FREE SEO strategy Session + Discount Now: https://go.juliangoldie.com/strategy-session 🤯  Want more money, traffic and sales from SEO? Join the SEO Elite Circle👇 https://go.juliangoldie.com/register Click below for FREE access to ✅ 50 FREE AI SEO TOOLS 🔥 200+ AI SEO Prompts! 📈 FREE AI SEO COMMUNITY with 2,000 SEOs ! 🚀 Free AI SEO Course 🏆 Plus TODAY's Video NOTES... https://go.juliangoldie.com/chat-gpt-prompts FREE AI SEO Skool Group: 🚀 Want to rank #1 and make more money with SEO? - Join here → https://www.skool.com/ai-seo-mastermind-group-3510/about - Join our FREE AI SEO Accelerator here: https://www.facebook.com/groups/aiseomastermind

Contents (2 segments)

  1. 0:00 Segment 1 (00:00 - 05:00) 972 words
  2. 5:00 Segment 2 (05:00 - 08:00) 601 words
0:00

Segment 1 (00:00 - 05:00)

Google just dropped an AI that can actually use your computer. It clicks buttons. It fills forms. It browses the web like a human. This isn't ChatGPT sitting there giving you answers. This is an AI agent that takes action. And it's available right now in preview mode. I'm going to show you exactly how it works and how you can use it. All right. So, Google just released something that's going to change everything. And I mean everything. They dropped this thing called Gemini 2.5 Computer Use. And it's not just another chatbot. This AI can actually control your browser. It can click. It can type. It can scroll. It can navigate websites just like you do. But think about that for a second. An AI that doesn't just give you information, but actually does the work for you. And here's the crazy part. Google released this literally one day after OpenAI's DevDay. They're going head-to-head in this AI agent war. And it's getting intense. So, what makes this different from every other AI you've heard about? Most AI tools just sit there and answer questions. You ask something and they respond with text. That's it. But this new Google model, it actually interacts with graphical user interfaces. It sees your screen. It understands what's on the page and it takes action. And the model is called Gemini 2.5 Computer Use. It's built on top of Gemini 2.5 Pro. And it's specifically designed to interact with UIs, especially in web browsers. You can access it right now through the Gemini API in Google AI Studio or through Vertex AI. It's in preview mode, which means it's still experimental, but it's ready to test. Hey, if we haven't met already, I'm the digital avatar of Julian Goldie, CEO of SEO agency Goldie Agency. Whilst he's helping clients get more leads and customers, I'm here to help you get the latest AI updates. Julian Goldie reads every comment, so make sure you comment below. Here's how it actually works. Google created something called the computer use tool. 
This is a new interface in their API. You send it a goal like "go to this website and fill out this form." The AI looks at a screenshot of your browser. It figures out what action to take next. Then it sends back a command. Click here. Type this. Scroll down. Your browser automation tool executes that command. Then it captures a new screenshot. Sends it back to the AI. And the loop continues until the task is done. It's a feedback loop. Prompt plus screenshot goes in, action comes out, action gets executed, new screenshot goes back in, repeat. And it keeps going until the job is finished or something breaks. Now, Google claims this model outperforms the competition. They say it beats OpenAI and Anthropic on multiple web and mobile control benchmarks, and it does it with lower latency. That's a big deal. Lower latency means faster actions. Faster actions mean more tasks done in less time. They're using a platform called Browserbase to benchmark these models. Browserbase runs something called Arena where you can watch different AI agents compete side by side. Google is working directly with Browserbase to evaluate how well their model performs compared to others. And according to their data, Gemini 2.5 Computer Use is winning. But let's talk about what this thing can actually do. The model is optimized for browser tasks, not full desktop control. So it's really good at navigating websites, clicking through pages, filling forms, gathering information, but it's not going to manage your file system or control your entire operating system. At least not yet. It's focused on web UI automation. The actions it supports are pretty straightforward. It can click by coordinates or by DOM element. It can double-click. It can type text. It can press keyboard keys. It can scroll. It can drag and drop. And developers can even add custom functions if they need something specific. So, what can you actually use this for? Here are some real use cases. Automated form filling. 
Think about how many times you have to log in somewhere or fill out a registration form or submit a survey. This AI can do all that for you. Web navigation and scraping. If you need data from a website that doesn't have an API, this can go get it. UI testing. You can use it to test user flows. Click through your website. Make sure everything works. Task automation. Go to this site. Click this button. Copy this text. Send an email. All automated. And if you're into competitive analysis, you can use Browserbase Arena to compare how different agents perform the same task. It's like a race, but for AI agents. You can literally watch them compete in real time. Now, I'm going to be real with you. This is still in preview mode. That means it's experimental. It's going to make mistakes. Sometimes it might click the wrong button. Sometimes it might get stuck on a CAPTCHA. Sometimes it might suggest an action that's not safe. Google is very clear about this. Do not use it for critical tasks or sensitive data without supervision. You need to watch it. The underlying tech comes from something called Project Mariner. That's Google's research project exploring how AI agents can browse the web and interact with humans. This isn't just a random feature they threw together. This is the result of serious research and development. And here's something interesting. This release came one day after OpenAI's DevDay. One day. That's not a coincidence. Google is pushing hard to compete in the agent and automation space. They don't want to be left behind and they're making bold moves to stay
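
Editor's sketch: the screenshot-and-action feedback loop described in this segment can be illustrated with a small simulation. This is not Google's API. `FakeBrowser`, `scripted_model`, `Action`, and `run_agent_loop` are hypothetical names invented for illustration, and the "model" here just replays a fixed script. A real setup would swap in a call to the Gemini API and a browser automation tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Action:
    """One command sent back by the model, e.g. click, type, scroll."""
    name: str
    args: dict = field(default_factory=dict)


class FakeBrowser:
    """Hypothetical stand-in for a real browser automation tool."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real client would return image bytes of the current page.
        return f"screen-after-{len(self.log)}-actions".encode()

    def execute(self, action: Action) -> None:
        # Record the action instead of driving a real page.
        self.log.append(f"{action.name}:{action.args}")


def scripted_model(script: List[Action]) -> Callable[[str, bytes], Optional[Action]]:
    """Hypothetical model: replays a fixed script, then returns None (done)."""
    remaining = list(script)

    def next_action(goal: str, screenshot: bytes) -> Optional[Action]:
        return remaining.pop(0) if remaining else None

    return next_action


def run_agent_loop(goal: str, browser: FakeBrowser,
                   model: Callable[[str, bytes], Optional[Action]],
                   max_steps: int = 20) -> int:
    """Goal + screenshot in, action out, execute, repeat. Returns actions run."""
    for step in range(max_steps):
        action = model(goal, browser.screenshot())
        if action is None:      # the model signals the task is finished
            return step
        browser.execute(action)
    return max_steps            # step budget used up before the model finished


browser = FakeBrowser()
model = scripted_model([
    Action("click", {"x": 120, "y": 340}),
    Action("type", {"text": "hello"}),
    Action("scroll", {"direction": "down"}),
])
steps = run_agent_loop("fill out the form", browser, model)
print(steps, len(browser.log))
```

The `max_steps` cap is a design choice for the sketch: since the loop only ends when the model says it is done, a step budget keeps a confused agent from running forever.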
5:00

Segment 2 (05:00 - 08:00)

ahead. If you want to scale your business and save hundreds of hours with AI automation, you need to check out my AI Profit Boardroom. It's the best place to get more customers and automate everything with AI. I'll drop the link below. This is where the real magic happens. You'll learn how to use tools like this Gemini computer use model to actually grow your business and make more money. Now, let me show you how you can actually start using this yourself. First, you need access to the Gemini API. You can get that through Google AI Studio or Vertex AI. Then you enable the computer use tool in your generate content config. You point it to the browser environment. That's it. You're ready to go. Google has a reference implementation on GitHub. It's called google/computer-use-preview. You can clone that repo right now. Run it yourself. See how it works. Play around with it. That's the best way to learn. And if you want to see it in action against other models, go check out Browserbase Arena. You can watch Gemini 2.5 Computer Use compete against OpenAI and Anthropic models in real time. It's wild. You see them all trying to complete the same task and you can see which one does it faster and more accurately. Now, let's talk about the strengths of this model. First, it's more natural. It interacts through the UI just like a human would. You don't need APIs. You don't need structured endpoints. It just sees a screen and knows what to do. Second, it's flexible. It can handle any website, even ones that don't have APIs. Third, it uses visual context. It's looking at screenshots. It understands layouts. It sees what you see. Fourth, it's competitive. The benchmarks show it's performing better than the alternatives. And fifth, there's open-source reference code. You can actually dig into how it works and build on top of it. But let's be honest about the limitations, too. It's still in preview. That means bugs. That means unpredictability. Web pages are complex. 
There are dynamic elements, pop-ups, modals, login walls, CAPTCHAs. This model can struggle with those. Security and privacy are real concerns. If it's typing passwords or clicking on sensitive data, you need to be careful. The scope is limited to browsers right now. It's not doing full desktop control. Not yet anyway. There's also a risk of adversarial attacks. Someone could trick it into doing something malicious. And there's performance overhead. Capturing screenshots, rendering, and automating takes more resources than just running a language model. And if you're serious about using AI to grow your business and make more money, you need to join the free AI Money Lab with Julian Goldie. Inside, you'll get 50 plus free AI tools and 200 plus ChatGPT SEO prompts. You'll learn how to make money with AI agents. You'll get access to 1,000 plus free n8n workflows, 200 plus ChatGPT prompts, plus you get a free AI community, a free AI course, and proven AI case studies. The link is in the description below. Look, this Google computer use model is a game changer. It's not perfect. It's still experimental, but it shows us where things are going. AI agents that take action, that do the work, that save you time and make you money. And it's available right now. So, go test it. Go build with it. Go see what's possible. And let me know in the comments what you think. Julian reads every comment, so drop your thoughts below.
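
Editor's sketch: the advice in the transcript to supervise the agent around passwords and sensitive data could look something like the gate below. Everything here is an assumption made for illustration: the action names in `SENSITIVE_ACTIONS`, the `supervised_execute` helper, and the `confirm` callback are hypothetical and are not part of Google's tool.

```python
from typing import Callable, Iterable, List

# Hypothetical action names that should never run without a human approving.
SENSITIVE_ACTIONS = {"type_password", "submit_payment", "delete"}


def needs_confirmation(action_name: str,
                       sensitive: Iterable[str] = SENSITIVE_ACTIONS) -> bool:
    """True when the action is on the sensitive list."""
    return action_name in set(sensitive)


def supervised_execute(action_name: str,
                       execute: Callable[[str], None],
                       confirm: Callable[[str], bool]) -> bool:
    """Run the action only if it is harmless or a human approved it.

    Returns True if the action was executed, False if it was blocked.
    """
    if needs_confirmation(action_name) and not confirm(action_name):
        return False
    execute(action_name)
    return True


executed: List[str] = []


def approve_nothing(name: str) -> bool:
    # Stand-in for a human reviewer who declines every request.
    return False


ran_click = supervised_execute("click", executed.append, approve_nothing)
ran_payment = supervised_execute("submit_payment", executed.append, approve_nothing)
print(ran_click, ran_payment, executed)
```

In a real setup, `confirm` would pause and ask a person, so a tricked or mistaken agent cannot complete a risky step on its own.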
