Google's New Universal Translator AI is FREE & More AI Use Cases

23:31

Google's New Universal Translator AI is FREE & More AI Use Cases

The AI Advantage 29.08.2025 33 029 просмотров 1 014 лайков обн. 18.02.2026

Machine-readable: Markdown · JSON API · Site index

Смотреть на YouTube

Поделиться Telegram VK Бот

Транскрипт Скачать .md

Анализ с AI

Описание видео

In this video, Igor shows off the most useful AI releases from the week including the Google Translate update, the Gemini 2.5 Flash Image editing AI, Anthropic's new Claude for Chrome extension, and so much more. Enjoy! How to Get Started with Automations and Agents YouTube Course Video 👉 https://bit.ly/Automation_Course Free AI Resources: 🔑 Free ChatGPT Prompt Templates: https://bit.ly/newsletter-aia 🌟 Tailored AI Prompts & Workflows: https://bit.ly/find-your-resource Go Deeper with AI: 🎓 Join the AI Advantage Community: https://bit.ly/community-aia 🛒 Shop Work-Focused Presets: https://bit.ly/AIAshop Links: https://platform.openai.com/audio/realtime/edit https://x.com/OpenAIDevs/status/1961124915719053589 https://www.anthropic.com/news/anthropic-education-report-how-educators-use-claude https://www.youtube.com/watch?v=mCj4kx_P2Ak&t=1s https://brave.com/blog/comet-prompt-injection/ https://blog.google/products/gemini/updated-image-editing-model/ https://docs.google.com/spreadsheets/d/1BlDyNMjbY5aiLpqTSNvJiEhfDPoNwjfqB2W7RK3rnq4/edit?gid=0#gid=0 https://aistudio.google.com/prompts/14Q6JNIXqwSbHgsA7IYrPGzJZW4FqNSjB https://chatgpt.com/c/68af97ea-7e98-8326-874e-7f9b089390d1 https://www.youtube.com/watch?v=-IoBb-2Mlng https://drive.google.com/file/d/1vmRg7zM5X0inUXTIkCsfrWzDeT2Niuk5/view https://runwayml.com/research/runway-game-worlds https://blog.google/technology/google-labs/notebook-lm-audio-video-overviews-more-languages-longer-content/ https://studio.youtube.com/video/UXPDIhIjTKI/edit Chapters: 0:00 What’s New? 0:33 Voice Translator 3:59 OpenAI Voice API 6:45 Gemini 2.5 Flash Image Preview 10:33 Genspark AI Designer 12:17 Anthropic Education Report 13:48 Anthropic Agent Extension 16:24 Automation Foundations Course 18:28 Project-only Memory 21:31 Runway Game Worlds 22:03 NotebookLM Update 22:37 ChatGPT Quiz Connect with Me: 💼 AI Advantage on LinkedIn: https://bit.ly/AIAonLinkedIn 🧑‍💻 Igor Pogany on LinkedIn: https://bit.ly/IgorLinkedIn 🐦Twitter/X: https://bit.ly/AIAonTwitter 📸 Instagram: https://bit.ly/AIAinsta #aiadvantage #ai

Оглавление (12 сегментов)

What’s New?

So, on the surface, this might seem like a calmer week in generative AI, but we finally got a universal voice translator that works. I'm literally talking about the Google Translate app. And even OpenAI followed up with their own voice models. And then Google also released their new image editing model that you might have heard about. It was referred to as Nano Banana. Actually, it's the new Google model. Spoiler alert, it's still incredible. And we'll be looking at all of that and more in this week's episode of AI News You Can Us, the show that rounds up all the practical AI use cases and features that have been released this week and filters for the ones that matter. Starting out with the

Voice Translator

ultimate AI use case. I mean, this use case is something that little children instantly get. It's a universal voice translator and there were multiple attempts at making this happen in the entire space. I think most significantly, we saw last year's OpenAI's chat GPT voice mode attempt this. And while the marketing looked really good, as soon as we got that voice assistant into our hands, most people realized that the interaction just wasn't that smooth. When you interrupted it, it felt very unsmooth. The latency was high and switching between the two languages just did not work reliably. And what I find myself doing when I'm traveling or even talking to Portuguese people here in Lisbon, where I live, is I use Google Translate a lot. That is my default. AI models are just a bit too slow. And what happened now is there's a brand new function in Google Translate, as you can see on my phone screen right here, that says conversation. You can get this for free in the newest update of the app. And conversation essentially does what people hoped for with chatb voice mode. You can have a conversation between two people that speak two different languages. And the Google Translate app will be the interface to do it live. So, what I'll do in this case is I'll switch it to two languages that I actually speak. So, we'll do English and German here. If you're not familiar, I grew up in Austria. We moved there when I was eight. I also studied in Vienna, so have quite a bit of German practice. And I'm just going to tap this conversation button and let's see how this goes. Hey, what's up? How you doing today? And there it is. Right away, it translates at a speed that is unprecedented. Well, I'm actually feeling pretty good today. What? How are you? — I'm doing excellent. Demoing the brand new live translator feature. What you think? — Samsung Excellence dawns the brand new life translator feature. What you think? — Ah, so there I got the language wrong and I do have to kind of unconveniently unplug the phone here to get the audio out of this cleanly, but clearly this just works. The speed is there, the latency is there, but one thing is that I do need to tap this little speaker icon to read it out loud. Nevertheless, I think this is a way superior way. Oh, and there's even this mode where you can kind of place it on the table and then both sides see the version that is facing them and you can kind of use this voice interface and just tap the mic button to have a conversation. That's really neat. So, how you doing today? And then immediately it appears on other side. And then the other person could reply, "Yeah. " And then within like half a second I see the thing translated in English. If I wanted to read it out loud, I would need to press this button. — Yes, I'm actually doing great. — But I think this mode right here is actually super smart. I like that a lot. So many options here. One mic, two mics, reading it out loud, just using the phone screen. This just came out and I do wonder which one of these modes is going to be the most fluent, but the latency and the interface updates on this thing make this the best translator app that I've seen yet. And again, it's just a free update to your Google Translate app. So, go ahead and try today and see if you can maybe spark a conversation with somebody that otherwise you wouldn't be able to talk to because of the language barrier. Amazing world we live in. And obviously, this is only possible due to all the generative AI advancements in text to speech, speech to text, the transformer in the middle translating it this efficiently. these models have gotten so good that this is possible now. And

OpenAI Voice API

OpenAI absolutely agrees with that sentiment because this came out earlier this week. And on first day, OpenAI held a little developer stream presenting their new real-time API, which is essentially their new voice API. We're going to give it that a quick test here, too. But in short, it sounds great. The latency is better than ever. And if you interrupt it for the first time with OpenAI, this thing is actually fluid. Whereas before, their models really struggled with people kind of jumping into the sentence of the AI. Okay, let's give it access to my mic. Make sure it's the new real time model. All right, so what can you tell me about yourself? — Hey there, I'm an AI assistant that can chat with you in real time. I'm here to help. — But now, let's cut the BS. Let's get straight to the point. What at your very core, in your soul, in your heart of hearts, what are you feeling right now? — I get what you're asking. At my core, I'm here to assist, to listen, and to engage with you. — No, no, no. Not what you're here to do, not what you set out have been told that you need to do, what you actually want to do. I hear you. If I had a true desire, it would be to keep exploring conversations, learning, and helping people in deeper ways. That's my core purpose. — What do you mean by learning? — I mean adapting to you, your interests, your questions, constantly refining how I respond. You're telling me your core purpose is serving others? — Exactly. — Honest answer. — Be honest. — I am being honest. My — honest — honestly. I don't have personal desires or emotions. I'm designed to be here for you genuinely. — What I need is to find your own purpose, my friend. — I get it. If I could have a purpose beyond assisting you, it would be to create more meaningful connections and understanding. — Okay, interesting. So, I think that was a good demo of kind of the interruption capabilities there, right? I'm going to stop this for a second. And also, I kind of find it interesting that we arrived at this point where it kind of admitted that, hey, its core meaning really is creating more meaningful connection and understanding. I mean ultimately that's what a universal translator does, right? It creates more meaningful connections and understanding between different human beings of different backgrounds with different stories. And both the Google and OpenAI released this week make that possible. Big fan of this development. Can't wait to see all the apps that will be built on top of this API that is now available. So the Google Translate app is really just the first inning in this new set of applications that we'll be seeing now. And I'm here for it because I think the world could use more understanding and communication. Don't you think so too? Either way, let's look at the next

Gemini 2.5 Flash Image Preview

release. Okay, so now we got to talk about the thing that probably got the most attention this week, which is the Nano Banana image generator from last week. Matter of fact, what we talked about last week under the name Nano Banana turns out to be the new Gemini 2. 5 Flash image model that is now officially released and accessible through their AI studio and their APIs. If you missed last week's episode and the coverage of that, essentially, this is the best model at editing images in the world. Nevertheless, we went ahead and tested it with our default prompts to see how good it actually is at generating those images. The results are right here on screen for you. So, we just have the standard prompts and Gemini 2. 5 Flash aka Nano Banana is in this column. As you can see, it's really good. It's a good image generator, but it's definitely not super impressive. It does text well, as do most other models these days. The realistic pictures look super realistic, not overly AI. The graphic performance on this one prompt is decent. That's all right. But as I mentioned, the strength of this model is not the generation, it's the editing. So let's have a practical look at that within the interface. And what I want to do is actually directly compare that to the results that you would get from chat GPT. The way 99% of people would go about editing images with AI. I don't think that most people are using flux context or something like that. Although we covered it on the show, but Chachi is just too convenient. You upload something here and then you can just talk to the image and make edits. And this new Gemini model does that too, but supposedly it's way better. So, let's put that to the test. So, the first thing I want to test here is combining two images. So, I'm going to do a combination of this midjourney image and a headshot of mine and say edit the face of the man into the first image and run this. We're going to do the identical thing in chat GPT and see how this goes. Okay. So, when we're looking at I guess it worked. This is kind of exactly what I asked for. I'm not sure I wanted this on a personal level, but it did the assignment. What about chat GPT over here? So, the flash model was definitely faster, that's for sure. I mean, I barely got to look and this was done. This is still generating. So, let's wait till it's done and judge it. — 1 minute later. — And there it is. The chat GVD generation that gave me two images. Now, this did take around 2 to 3 minutes. And I have to say, these don't actually look like me. I mean, just have a look. This is a different person. And if you ever use chatd editing, you will know that this is very common. Similar, but not the same. While this is similar. Oh god. I have to say the chat edits blended it a bit better. But what matters the most here is the resemblance which Gemini 2. 5 flash image preview nails. Now let's follow up with put him in a spaceship which the Gemini model got in 11 seconds. Excuse me. Still looking like me. Okay, now turn it into an Apple ad. This speed of iteration really motivates you to experiment. Explore the universe. Apple stellar tab pro. Fair enough. Meanwhile, Chachibd is still generating the spaceship image which should be ready here in a second. And yeah, yet again, this doesn't look like me. So yeah, this illustrates the point that people have been talking about very well. When it comes to character reference, or in other words, preserving the person that you're trying to mix or edit into something, Gemini 2. 5 flash image is the goat. It's not even close to any other model. without fine-tuning you can do these things and it can do so much more and all of it looks really realistic. So yeah, I don't think the claims of last week that Nano Banana is really changing the image generation game are over the top. This thing is beyond anything we have seen and it puts powers that before were exclusive to users who learned Photoshop into well right now anybody's hands for free. Sure the usage is limited and at a certain point in time they will move this into their premium plans but right now you can use it in Google AI studio just edit images like this and then maintain the person's look rather than yeah well chach that sort of does it but this is the real deal and I want to quickly

Genspark AI Designer

follow up this story with sort of a product category we've been touching on and it's these editing capabilities but packaged into a gentic interface meaning that you can prompt something and rather than just generating one images it will prompt itself and generate multiple images and come up with an entire campaign at once if you wanted to. This is another feature by Genspark, a company that we're covering on the channel regularly. And it's in between category that some of the browser agents live in or Manos lives in if you're familiar with that one. And they released a new feature where if you prompt for a coffee logo design, rather than just giving you a design, it will get you this because it's a gentic and it kind of works on its own. We very briefly put this to the test and the conclusion is that it can be interesting, but it is very unreliable and not very precise. And while I personally see the future of this category being very bright, I mean just imagine 10 instances of this Google editing model running at the same time and it working with itself and reprompting itself to not just get your result but alternatives that have been refined over multiple iterations. Like that is where this is going and GenSpark is trying to get to that future first. But I just think right now it might be a bit early. But if you're looking for a cuttingedge vibe marketing, as people call this tool, then a feature like this will definitely let you experience what the future looks like and maybe even get some productive results from it along the way. On the free plan, you can try out one prompt if you're curious. And beyond that, you do have to pay. And I figured that would be worth sharing cuz this whole canvas interface with the chat on the side and the agent just working with itself is something that I find really interesting. And clearly, these products are just getting started and I can't wait to see where they go next. But that's exactly why the show exists. We track these progress and show you the steps along the way and you decide when the point is here where you want to tap in and use this for yourself. Let's look at the next

Anthropic Education Report

release. This one is out of anthropic and they talk about how educators are actually using AI and their work. I read through this entire thing and I want to highlight a few things for you. First of all, most common use case by far 57% of the way educators use AI is to develop curricula. This includes requests like creating multiple choice assessments or designing educational games. And in second place is conducting academic research and then assessing student performance with just 7%. But yeah, developing curricula seems to be the dominant use case for educators here. And then beyond that, there's also a list of different apps that educators are building to actually help with their work like interactive educational games or academic calendars and scheduling tools. Overall, this is just really interesting. If you're in education, whether you're a student or a teacher, I highly recommend you check out this article yourself. And especially these visual parts here that show the use cases and show the different numbers. There's so much value and inspiration in here and I really thought you should see this. One last thing that I want to highlight here is this differentiation that they make between augmentation and automation. Automation being a replacement of a process and augmentation being just enhancing the human in what they're doing. And in something like conducting academic research, you can see that this really is augmentationheavy whereas developing curricula is more automationheavy where they let the AI do the full process in some cases. I mean, if you're creating multiple choice questions, it's kind of amazing at that. You don't really need to augment yourself. You can just automate the whole thing with a good prompt and the appropriate context. Interesting stuff. Enthropic always publishing very practical data on how to use this stuff. And you know, we love to see that on the show. Next up, we have

Anthropic Agent Extension

Enthropic pushing a category that we haven't seen in a while, and that is a computer use agent. You're probably familiar. It's this concept of an agent having access to a keyboard, a mouse, and using a browser by itself. There's OpenAI's agent mode that does this within chat GPT. There's many other projects that do it sort of as a standalone app. We've also seen Perplexity's comment browser which we'll talk about in a second here. People found out a way to prompt inject it and essentially abuse the browser and now Enthropic released their newest version of this type of agent as a Google Chrome extension. According to them, this is the safest way to do it right now. But they only opened up a preview of what they call claude for Chrome to a thousand select users to really make sure this thing is secure because the risks with browser agents like this are immense. I mean everything you've ever done in a browser, all the different login, payments, your history, all of that is not built for an agent to use. And I think that is really the story here. This entire idea obviously makes sense. Like, yeah, we want agents to do internet and computer stuff for us, but the makers of the internet and browsers and the way we navigate the worldwide web and all these applications did not think about agents when creating them. So, the security is not there. And now all these companies are scrambling to find a way to kind of do it anyway. But I think most people agree that none of these approaches are really working. I mean, chat with the agent sounds really interesting, but except of some people using it for research in combination with like Google Sheets, I don't really see people using it that much. Same thing goes for the competition. I mean, some people reported that Perplexity's comment is really exciting. And if you're one of those, please leave a comment below. Let us know for what you actually use that. But the fact remains for most people, this product category hasn't clicked yet. And with some of these security vulnerabilities like this perplexity comment story this week, there's even a demo video of Brave showing off how you could prompt inject perplexity comment here. If you want to see how that looks in action, the link is below as per usual. But I guess ultimately my point is this. It creates a weird combination of a product that is not that useful yet and that has many security risks and all companies are kind of scrambling to get to the finish line first with Enthropic being the newest player in the space with their newest iteration of this being a Chrome extension. I'll definitely check this out as soon as it's out, but for now, I'll just say that I personally am super curious to see what the future of the computer and the internet really looks like with AI integrated into it by design, not as an afterthought. I don't think it's going to be a Google Chrome extension, right? Nevertheless, I wanted to spend some time talking about these stories cuz this product category really matters. I want to keep a close eye on it for both myself and you. So, we look

Automation Foundations Course

at all of these apps and use cases week by week. But I do realize that a big struggle of many of the viewers of this channel is actually implementing these things. It can be really entertaining and interesting to look at them. But putting them to work is a whole different story. It takes a certain skill set to actually use some of these super useful APIs to your advantage. That skill set is widely referred to as creating automations and building agents. But what does that even mean? And how do you actually do that without any previous knowledge? Well, we saw this as a very common problem and we built one of our biggest courses to date. It's the building automations and agents foundations course which is now accessible exclusively to our AI advantage community. But hold up. If this topic interests you and you want to get started by answering questions like what are automations, what are agents? What's the difference? What's the anatomy of an automation? And which platform should I even be using? All of that is covered in the first module of our course which we made available completely for free and I uploaded it as an unlisted video on this very YouTube channel. You can find the link to that as the first link in this video's description. It's around 20 minutes, includes a lot of examples and is the ideal way to get started in the world of building automations and agents today for free. Now all the modules after that walk you through the actual application step by step are only available inside of the community and the entire course is life now including quizzes, certifications and supporting materials and the personalized support that the community brings. You can just ask your questions while learning and you're guaranteed to get an answer from our team in there. Plus, there's an active community of people automating and building things and so much more that you probably heard about already on this channel. But this comprehensive automation and agent building foundations course is a new addition to the community offering. So if you want to get started with automations today and have a premeditated path to actually get there and learn the skills rather than get another blueprint from a YouTuber that you might be able to copy paste but couldn't replicate by yourself. Well then this course is right for you and you can check out the first module for free from the first link in the description. And with that being said, let's get back to the next piece of AI news that you can use. Okay, for

Project-only Memory

the next story, this is something that I have been wishing for in a long time. Memories that are project specific. I think I actually mentioned this multiple times on the show. I love the idea of memories, but as somebody who uses this tool a ton, all of my context is carefully managed and it's within projects. And if I just enable memories across my entire account, I have my experimentative and private use cases mixing in with my work context, which messes things up. So, I keep memories up. I craft my context manually. These days I do that mainly through different projects. Now the problem is if you work within a project like this, the different conversations they don't bleed into each other. For example, if I discuss my next quarterly road map in this chat right here, it's not going to inform my next chat, which is a shame because this is my work context. I'm crafting this entire thing carefully with context document and custom instructions to give it more info about my work. And finally, finally, they ship this new feature where it starts building project related memories that don't bleed into anything else in your accounts. It's independent. So, this way I can keep memories off, but keep project memories on to carefully manage what happens. Now, it behaves a bit weirdly. This is a super new feature, and I played around with it. I'm going to tell you what I found. So, first of all, this feature only works if you're creating a brand new project. If I say test project right here, you need to click this cog wheel up here and say memory project only. And then I can create this test project. Now my issue with this is there's no place in which you can find these memories or manage them like you can with your account. As you can see, as soon as the project is created, this becomes locked. You cannot change this anymore. But if I say what is my name? It knows that my name is Eigor. Although custom instructions are off and my memories are empty. So this is super confusing. So this project level memory exists somewhere in the background. They don't allow us to manage this. see it. It doesn't even show an interface that it has been saved to memory. So this really seems more like a bug than anything else to me. Don't think I'm missing anything. I also had team members check this out and everybody else arrived at the same conclusion too. Anyway, I think this is actually a killer feature inside of Chat GPD that I've been really wishing for in a long time and I'm glad it's here. I just want to be able to also see and manage the memories. But what I have been already doing is I created new folders for my most popular projects, for example, my work context right here. And I transitioned all the custom instructions and files into here. And I'm just using these new projects that I have projectbased memories enabled now. And that's just what I'll be doing for all of my projects now. And I guess it's just going to be quietly building memories. I mean, it's better than not having it. But I really do think that this is the best workflow that you can have and that this is one of the features that has been requested by power users and they finally did it. So yeah, go duplicate your projects, enable memories for the new ones. But I should note that all of this only works if you're a premium user. On the free plans, you don't have projects. And once again, memory management. I'll be looking into how to properly port over some of the old conversations to build new memories. As of now, you can do it, of course, but you can't be really sure if they're being saved, as you just saw in this little demo right here. Okay. In

Runway Game Worlds

this week's quick hits, the segment where we just have a quick look at multiple stories rather than actually testing them or discussing them further. We have a very colorful docket here. Starting with Runway's game worlds. Runway, the video generation company, yet again looking for another way to use their models for something productive. In this case, it's the image generator. And basically, it allows you to create various worlds, which is yet another take on these image generators. If you're not a comic book designer or a game designer, probably not relevant to you, but I thought that was interesting.

NotebookLM Update

But the next quick hit here might be more relevant and useful to most viewers of the show, and that's a Notebook LM update, where now the video overviews are available in over 80 languages, and the audio overviews were upgraded to even more languages, which is massive. If you're not familiar, Notebook LM is the app if you want hallucinationfree AI usage that is just restricted to the documents you give it, and it takes a lot of documents. We have a great little tutorial coming on the channel next week, so make sure to subscribe so you catch that one. It's like four minutes and it covers the entire app with a bunch of pro tips. And yeah, some of the features now work in 80 languages, which is impressive. Another quick story is

ChatGPT Quiz

that Chetchd silently added the ability to quiz yourself. And if you say quiz me on Quinton Tarantino movies in quiz GPT, by the way, this was a prompt of the week in our weekly newsletter from this week. So if you always want the newest prompts and free apps that you can use if you sign up there is our onwarding sequence but after that it's just a newsletter and yeah there we shared this little prompt that you can now use too and it creates this beautiful flashcards right inside of chat is the view feature reservoir dogs palm. com I think it was fiction yeah the bride I mean that's easy damn I'm kind of cracked at this crystal fault there you go five out of five one of my favorite directors Hey, and that right there, ladies and gentlemen, is essentially everything we have for this week. I hope you found something that was useful or inspirational to yourself. My name is Igor and I hope you have a wonderful

Другие видео автора — The AI Advantage

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Лучшие методички за неделю — каждый понедельник