Google Gemini Agentic Vision Tutorial - How To Use Google Gemini Agentic Vision
5:19

TheAIGRID 04.02.2026 4 406 views 143 likes

Video description
Check out the free community: https://www.skool.com/theaigridcommunity
🐤 Follow me on Twitter: https://twitter.com/TheAiGrid
🌐 Interested in AI business: https://www.youtube.com/@TheAIGRIDAcademy

Links from today's video:
https://blog.google/innovation-and-ai/technology/developers-tools/agentic-vision-gemini-3-flash/
https://aistudio.google.com/apps/bundled/gemini_visual_thinking?e=0&showPreview=true&showAssistant=true&fullscreenApplet=true

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos. Was there anything I missed?

(For Business Enquiries) contact@theaigrid.com

Music Used
LEMMiNO - Cipher https://www.youtube.com/watch?v=b0q5PR1xpA0 CC BY-SA 4.0
LEMMiNO - Encounters https://www.youtube.com/watch?v=xdwWCl_5x2s

#LLM #Largelanguagemodel #chatgpt #AI #ArtificialIntelligence #MachineLearning #DeepLearning #NeuralNetworks #Robotics #DataScience

Contents (2 segments)

Segment 1 (00:00 - 05:00)

So Google just released Gemini 3 Agentic Vision, and this is essentially the new frontier in AI capability for vision models. This is one of the areas of AI development that has been a little bit slow, but Gemini 3 Flash Agentic Vision essentially covers the gap where AI isn't that good at vision. So I'm going to show you guys how you can use this as a beginner, and let's dive right into it. So, if you come on over to this website called Gemini Chat with Agentic Vision, this is going to be how you experience Agentic Vision in Google Gemini, because this is by far the most effective way to use the software. Now, of course, you can use the demo on the website, but if you do want to play with this yourself, make sure that you go to Tools and then enable the code execution feature. This is what lets the model use code to solve complex tasks. Once you've done that, make sure as well that on the right-hand side you have Gemini 3 Flash Preview selected. This is the model with Agentic Vision. No other model has it. Once you have this set up, you can go ahead and play with the model. But for now, we're going to use the demo in AI Studio to actually return images. Now, when you get onto this website, you'll see nine different examples of ways to analyze images. This is a lot better than the standard version of Google Gemini's image model, and I'm not sure if they're going to add this into the standard chat function, because the way you're able to analyze images is completely different. Take, for example, this demo, where it's able to crop out every single animal, use them all as icons in a Matplotlib plot, and then show the lifespan of those animals. Essentially, what this means is that Google Gemini is using its agentic capabilities to analyze the image, dice the image, cut out every single one of these, and then put that into a chart.
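The AI Studio setup described above (Gemini 3 Flash Preview with code execution switched on under Tools) has a rough counterpart for anyone calling the model from code. The sketch below only builds the JSON body for a `generateContent` request and does not send it. The model id `gemini-3-flash-preview`, the endpoint URL, and the helper name `build_request` are assumptions that are not stated in the video, so check them against the current Gemini API documentation before relying on them.

```python
import base64
import json

# Assumed model id and endpoint; neither is given in the video.
MODEL = "gemini-3-flash-preview"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_request(image_bytes: bytes, prompt: str, mime_type: str = "image/png") -> dict:
    """Build a request body with the code execution tool enabled (hypothetical helper)."""
    return {
        # Rough equivalent of enabling "Code execution" under Tools in AI Studio.
        "tools": [{"code_execution": {}}],
        "contents": [
            {
                "parts": [
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                    {"text": prompt},
                ]
            }
        ],
    }


# Placeholder bytes stand in for a real image file.
body = build_request(b"fake-image-bytes", "Crop out every animal and plot their lifespans.")
payload = json.dumps(body)  # what would be POSTed to ENDPOINT with an API key
```

Sending `payload` requires an API key and an HTTP client, which are left out here on purpose.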
You have to understand that this is something that not every AI is capable of doing. In fact, I would argue that Gemini is probably the only one able to do this. And one of the most remarkable things about this is that Gemini is able to basically cut out the image and do all of the calculations on every single individual creature. And not only that, it's able to do it in a relatively quick time. You can see here it says, "I've successfully extracted all 39 animals from the image and created a bar plot showing the typical lifespans sorted from the shortest to the longest." If you asked any other AI to do this, it would simply take far too long. Now, you might be wondering, well, this is cool and all, but what are the use cases? Well, it's pretty clear. If you have an image that is super detailed with a lot of different things in it, you would ask Google Gemini to take that image, decompose that image, and then present it as a structured bar chart or some kind of data that you would easily be able to read. This is one of the key use cases that most people are missing. Now, another example of Google Gemini's Agentic Vision at work is the ability to annotate images. Most AIs are static. They analyze the image and then they don't do more than that. But with Google Gemini's Agentic Vision, this prompt says, "Annotate the image with the different colors, which object should go into which bin." And in a moment, you'll see arrows pointing to the different objects that show which bin each one should go into. Once again, this is pretty exceptional, because most AIs aren't able to reason and use code to actually draw on an image in a way that gives you your answer. Another example here is, "Make a bar chart of per-category performance. Normalize the data. Normalize prior state-of-the-art as one for each task and then take an average per category. Plot using Matplotlib with nice style." And then here you can see we have the final outputs, which are super accurate.
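The normalization step in that last prompt (prior state-of-the-art set to one for each task, then an average per category) comes down to a small amount of arithmetic. The sketch below shows one way to compute it before plotting. The row layout, the function name `per_category_average`, and all of the scores are made up for illustration and are not taken from the video or from the code Gemini generated.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (category, task, model_score, prior_sota_score).
# These numbers are invented for illustration only.
rows = [
    ("reasoning", "task_a", 60.0, 50.0),
    ("reasoning", "task_b", 45.0, 50.0),
    ("vision", "task_c", 90.0, 75.0),
]


def per_category_average(rows):
    """Scale each score so prior SOTA equals 1.0, then average per category."""
    ratios = defaultdict(list)
    for category, _task, score, prior_sota in rows:
        # Dividing by the prior SOTA score makes prior SOTA equal to 1.0.
        ratios[category].append(score / prior_sota)
    return {category: mean(values) for category, values in ratios.items()}


averages = per_category_average(rows)
for category, value in sorted(averages.items()):
    print(f"{category:<10} {value:.2f}")
```

A value above 1.0 means the model beats prior state-of-the-art on average in that category. The resulting dictionary is what would be handed to a bar chart.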
And you might think that, yes, certain versions of ChatGPT or Gemini are able to do this, but you have to understand that one of the key things we're looking for here is accuracy. If you need super accurate image analysis, this is going to be the tool for you. So, for example, one key thing that I've always wanted to do, because sometimes I like to study financial charts, is analyze the swing lows and swing highs. So, I've actually asked Gemini with Agentic Vision to analyze the image and place an arrow at the swing high and swing low. And you can see right here that it's able to analyze the image and place an arrow at the swing high and an arrow at the swing low. Stuff like this is super important for traders who are trying to do many different things. Maybe you don't have the time to mark out the swing highs and the swing lows. Obviously, there are certain algorithms that you can use to do that. But this just shows you the very basics of what image analysis is able to do when you have software that is actually accurate. Now, one of the last examples I really do want to show you here is advanced reasoning. Essentially, what this is able to do is reason about the possible issues that it identifies in this image. And I think that this is super useful, because sometimes you might take a picture of something. Maybe it's at work. Maybe you're looking at a box of electronics where you don't really know exactly what's going on, you don't know which part is which, and Gemini Live simply can't help you. This is going to be a tool that really does help you in those specific cases where you need a lot of specific information. And so here it's basically able to tell you that although both rulers are marked in centimeters, one is clearly wrong. So like I said, if you're really trying to work out whether something's wrong, maybe you're trying to measure something, this is going to be a super useful tool.
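On the swing highs and swing lows: the speaker mentions that algorithms exist for marking them but does not name one. A minimal sketch of one common approach works on price data rather than on a chart image. It treats a bar as a swing high when it is strictly higher than its neighbours on both sides, and as a swing low in the mirror case. The function name `swing_points`, the `window` parameter, and the prices are assumptions for illustration, and this is not what Gemini runs internally.

```python
def swing_points(prices, window=2):
    """Return (highs, lows) as lists of indices into `prices`.

    An index is a swing high if its price is strictly greater than every
    price within `window` bars on both sides; swing lows are the mirror case.
    """
    highs, lows = [], []
    for i in range(window, len(prices) - window):
        neighbours = prices[i - window:i] + prices[i + 1:i + window + 1]
        if all(prices[i] > p for p in neighbours):
            highs.append(i)
        elif all(prices[i] < p for p in neighbours):
            lows.append(i)
    return highs, lows


# Made-up closing prices, purely for illustration.
closes = [10, 11, 13, 12, 11, 9, 10, 12, 14, 13, 12]
highs, lows = swing_points(closes)
print("swing highs at", highs)  # indices 2 and 8
print("swing lows at", lows)    # index 5
```

A larger `window` marks fewer, more significant swings. The first and last `window` bars are never marked because they lack neighbours on one side.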
You can even see here, where you're trying to analyze a specific chip, the prompt is to zoom, rotate, and crop and figure out what the numbers on this specific chip are. It's able to zoom, rotate, and crop, and get you all of the information you need. So, if you enjoyed this video and you want to use this

Segment 2 (05:00 - 05:00)

tool, like I said in the beginning, don't forget to go on over to Google AI Studio, select Gemini 3 Flash Preview, enable code execution, and then when you have issues with your standard images and standard Google Gemini, don't forget to come on over here to test whatever it is that you need, because it's quite likely that this recent Google update has solved it. As always, if you enjoyed the video, I'll see you on the next
