Googles AI Innovations, OpenAI Agents & More AI Use Cases
24:15

The AI Advantage, 14.03.2025, 63,222 views, 1,654 likes, updated 18.02.2026
Video description
This week is packed with major AI updates! Google is rolling out several big announcements, including updates to AI Studio and the new AI Mode. But OpenAI isn’t falling behind—there’s big news on their latest writing model and platform expansions. This week is especially exciting for AI developers, but there’s plenty more for everyone else too. Links: https://aistudio.google.com/prompts/new_chat https://blog.google/products/search/ai-mode-search/ https://blog.google/technology/developers/gemma-3/?linkId=13397566 https://github.com/camel-ai/owl https://x.com/a16z/status/1897738976934642168/photo/1 https://manus.im/ https://www.tavus.io/ Chapters: 0:00 What’s New? 0:48 Google AI Studio Update 4:59 Google AI Mode 6:32 Gemma 3 7:38 OpenAI Update 8:54 OpenAI’s Writing Model 10:29 Operator Available Everywhere 10:50 Manus AI 15:23 OWL 15:57 Responses API 17:14 Agents SDK 19:00 Tavus.io 21:05 Top 50 AIs #ai Free AI Resources: 🔑 Get My Free ChatGPT Templates: https://myaiadvantage.com/newsletter 🌟 Receive Tailored AI Prompts + Workflows: https://v82nacfupwr.typeform.com/to/cINgYlm0 👑 Explore Curated AI Tool Rankings: https://community.myaiadvantage.com/c/ai-app-ranking/ 🐦 Twitter: https://x.com/IgorPogany 📸 Instagram: https://www.instagram.com/ai.advantage/ Premium Options: 🎓 Join the AI Advantage Courses + Community: https://myaiadvantage.com/community 🛒 Discover Work Focused Presets in the Shop: https://shop.myaiadvantage.com/

Table of contents (13 segments)

  1. 0:00 What’s New? 191 words
  2. 0:48 Google AI Studio Update 1010 words
  3. 4:59 Google AI Mode 358 words
  4. 6:32 Gemma 3 257 words
  5. 7:38 OpenAI Update 298 words
  6. 8:54 OpenAI’s Writing Model 395 words
  7. 10:29 Operator Available Everywhere 80 words
  8. 10:50 Manus AI 1073 words
  9. 15:23 OWL 147 words
  10. 15:57 Responses API 291 words
  11. 17:14 Agents SDK 408 words
  12. 19:00 Tavus.io 471 words
  13. 21:05 Top 50 AIs 770 words
0:00

What’s New?

All right, so the common story amongst all the AI releases this week really has been things merging. On this show I regularly point out that all the little features, GitHub repos, or previews that we look at will eventually be implemented into the software you're already using, or a big player just bundles them into one product. We've seen a bunch of that this week, most of it from Google, in both the Google Search and image generation arenas. Beyond that, we have more interesting releases, some of them really hyped, like Manus AI, which claims to be the Chinese Operator that actually gets things done. And then we have a brand new agentic framework coming out of OpenAI, again bundling multiple things together. So we'll talk about all of that and more in this week's episode of AI News You Can Use, the show that looks at all the generative AI releases from this week, where we take a first look at or test all of the ones that, according to the AI Advantage team, matter. All right, so I want to start
0:48

Google AI Studio Update

this week's video with Google AI Studio and the image generation capabilities that were added here. Even if you're not interested in image generation, I think this advancement is a big one because, as I pointed out, it's starting to bring things that we've seen isolated in separate places into one interface. Concretely, they've been stepping up the multimodal capabilities of their Gemini 2.0 Flash Experimental model, the main model that you also find in Gemini Advanced, their ChatGPT competitor.

The point here is this: up until now, if you wanted to generate an image, you told it what you wanted and got a result. If you wanted to edit that image, there were other applications that did just that. And if you wanted to stylize the image, you could do that with specific applications like Midjourney, but you had to use specific commands, and it took a bit of knowledge to get there. Whereas here, all of it just happens through chat, at the highest quality level, which is probably the most user-friendly interface to do this.

So let me just show you. I switch this output format to image and text, and I say, "generate an image of a cat with a hat." Voilà, we have that there. Pretty standard stuff, but this editing capability is the new thing here. So if I now say "make it hyperrealistic", let's see... 4.7 seconds. Oops, maybe the word was "photorealistic", so I'll just follow up by prompting that. Okay, so there you go, that didn't work. How about "remove the hat"? Nice. "Turn the cat into a tiger." As you can see, you can really iterate over this with text, just like you could over chat messages in something like ChatGPT. There you go, a tiger. What about "an angry tiger, and add the hat back in"? There you go. Not very angry. Okay, it's not doing the angry thing, but I think you see the point here.

I want to try one more thing, which is uploading one of my own images. So let's see, I just have one of these thumbnails lying around on my desktop. "Change the person's expression to be more excited." I'm really curious if it can do something like this. Oh my God, it did it. That's terrifying, "Here's Johnny!", but also kind of powerful. Okay, what if I say "more excited" one more time? Oh my God, it's going crazy. This is starting to look like JD Vance. Okay, it also changed the image, though. What about text editing? "Change the text ChatGPT to Gemini", for example. "ChatGPT Gemini"... sort of. I'll try one more prompt to fix the text. Okay, now it removed everything else, but it sort of did it.

So as you can see, it's not perfect, but honestly, this first one is really damn impressive. I haven't seen another tool execute this that well. I mean, their Imagen model is one of the best, and I would expect an interface like this to eventually make its way to all chatbots, because why not? It's just the best implementation. Let's round this out by saying "now turn this image into a cinematic movie scene" and seeing what it does on something a little more creative like that. Okay, not bad. Honestly, that background editing, the color shifting, I mean, that looks really good. If you cover up the part of the screen that shows the mutated me, you could legitimately use this for graphics work if you need to get things done quickly. And the expression change: impressive. And this is me just using a random Google account here in AI Studio, and as you can see, I didn't even link an API key. A certain amount of tokens is free of charge here, so you can just go and try this for yourself and see what you make of this innovation.

Okay, beyond that, there's one more improvement here in AI Studio which I really like, and that's the way to integrate YouTube videos. Let me show you quickly. If I go over to YouTube and, let's just say, I want to take this video from Google DeepMind, I can all of a sudden do this: "summarize this", paste the link, and run this prompt like so, and it just takes in the video. This seems like such an intuitive thing to happen, but if you think about it, none of the other chatbots really do this. I mean, if I put this into ChatGPT, it hallucinates something from a different video. Look at that, that's not even the title of this video. This one is Gemini Robotics and not AlphaGo: The Movie. And this just immediately pulls up the video and summarizes it for you. Now, Google products have always been best with YouTube videos, but I wanted to highlight this because I know a lot of people work with online videos for either content creation or research, and to me this does seem like the best interface to do that right now. I reckon over time we'll see a feature like this in every LLM platform out there. Just quickly wanted to show you that this is available today in Google's AI Studio.

By the way, just a little side note: I'm currently a little bit under the weather. It was actually my birthday this past week, so I had a little birthday weekend. I had the best time, but I don't know, I suppose being 31 years old and doing a lot for a few days straight kind of gets to you. So I'm feeling better already, but yeah, not really at 100% yet. I hope you still enjoy the content, though. Let's get back to the next story.
4:59

Google AI Mode

Okay, next up we have another Google story, and this one is them essentially expanding their AI search features. This is just rolling out: some people have access already, and for other regions it's coming. They're adding a new tab to Google searches that just says "AI Mode", and it's essentially a Perplexity that's built into Google Search. Now, I myself don't have access to this, but team member Daniel already got access and tried it out for us, and right here in the screen recording you can see how this performs in practice. So basically the journey starts with google.com as it usually would, with a search like "Chick-fil-A news 2025". I mean, what else would you be looking for in your free time? And then here at the top, next to the normal Google tabs, they have a new tab called AI Mode, which gives you functionality just like Perplexity or ChatGPT search would.

Now, this is different from their initial approach, where they kind of forced this AI summary into the top of Google Search. Now they're putting it into a separate tab, and considering how much traffic Google already has, from the looks of it this is probably bad news for competitors like Perplexity. Either way, they're just playing around with this, and considering how convenient AI searches can be and how much adoption they've seen already, I would expect this as a fixed new tab within Google Search sometime soon. It just makes sense: if you want different links, you still have Google, but if you just want the info, well, AI Mode will give it to you. And this is sort of part of a bigger trend, right? Many of these services that we point out week by week as they release: they come out, people try them, and if they're really sticky, the bigger players rebuild them and integrate them into their applications, as they already have the traffic and therefore the distribution. Okay, now let's move on to the next one. Okay, and
6:32

Gemma 3

there's another big AI release coming out of Google this week, and that's the release of their Gemma 3 model, which, as they claim, is the best model that you can run on a single GPU or TPU. They show this Elo score in Chatbot Arena, where people rank outputs of different models, and how well this performs: better than Llama and o3-mini, slightly worse than DeepSeek R1. But this model is really small. I mean, look at that: 27 billion parameters, and DeepSeek R1 has 671 billion. A 27B size you can usually run on a lot of MacBooks, but they do say that the performance is optimized for NVIDIA GPUs. And I've got to say, all the benchmark scores look really solid for a model this small. If I read this technical report correctly, 64 GB of RAM should be enough to run this 27B model on your machine locally. In practice, with 64 GB the full context size could not be used, as that would require too much memory, but if you're okay with, let's say, 8,000 tokens of context, then yeah, this could probably be run on 64 GB of RAM. There are always so many factors involved that it will really depend on your machine. The fact is, there's a new model almost every week, and only time will show if people really like using this, but the trend of smaller and smaller models being more and more capable continues here. There seems to be
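As a rough back-of-the-envelope check on the RAM figures in this segment, the memory needed for a dense model's weights is approximately the parameter count times the bytes per parameter. The sketch below assumes 16-bit weights (2 bytes each) and ignores the KV cache, activations, and runtime overhead, which is the part that grows with context length. The function name and the 4-bit case are illustrative assumptions, not figures from the technical report.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in GB (10**9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9


gemma_27b_16bit = weight_memory_gb(27, 2)       # 16-bit weights
gemma_27b_4bit = weight_memory_gb(27, 0.5)      # hypothetical 4-bit quantization
deepseek_r1_16bit = weight_memory_gb(671, 2)    # for scale only

print(f"27B at 16-bit:  {gemma_27b_16bit:.1f} GB")    # 54.0 GB
print(f"27B at 4-bit:   {gemma_27b_4bit:.1f} GB")     # 13.5 GB
print(f"671B at 16-bit: {deepseek_r1_16bit:.1f} GB")  # 1342.0 GB

# Whatever is left of 64 GB must cover the context (KV cache) and overhead.
headroom_gb = 64 - gemma_27b_16bit
print(f"Headroom on a 64 GB machine: {headroom_gb:.1f} GB")  # 10.0 GB
```

Under these assumptions the weights alone take about 54 GB, which is consistent with the point above that 64 GB leaves room for a short context but not the full one.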
7:38

OpenAI Update

some ChatGPT update every single week, so I really do want to cover that, because I happen to know that most viewers are using it as their daily driver in terms of LLM platforms. What they implemented this week was yet again something developer focused: the ability for ChatGPT to write code directly in your IDE or code editor. For that you do need the desktop app, and as you can see in this little demo, it will just connect to it. With something like this, you don't need a copilot in your IDE anymore. Sure, there are some functions that a copilot still performs better, and no, I'm not a full-time developer who could appreciate every single little feature that a copilot might bring inside of an IDE. But I think for most people, having a workflow like this, where you have a free code editor like VS Code, you have it write code right inside of your editor, and you don't have to copy-paste things, is an absolute blessing. Unless you need one of the specialty features that some of these copilots bring, this is a fantastic feature that replaces the core functionality, which is seeing your codebase and adding to it. And in the context of this trend of vibe coding that we'll also be talking about today, you don't even really need to know exactly what's going on in these files. You just need to be able to talk to ChatGPT, and you need to have a vision for what outcome you want. But we'll talk a little bit about that in the section on vibe coding, as that is a trend that is becoming increasingly popular, and I also like to cover those
8:54

OpenAI’s Writing Model

on the show. Yet another ChatGPT update is a tweet from Sam Altman. This is not a release that came out this week, but I wanted to point it out: it's essentially him saying that they trained a specific model for creative writing. The models we see right now are not focused on that. All the reasoning models are focused on math, science, and coding; you have to realize that writing is not the main focus here. If you haven't seen this yet, I do recommend you pause this video for a second and read these three paragraphs. I won't be reading all of this out to you, but as an AI writing connoisseur myself, I can tell you this is like nothing we've seen before. GPT-4.5 right now, in my opinion, is state of the art when it comes to writing, yet that model is still not optimized for it, and a model that is specifically trained to write well is just a whole different beast. I mean, this is high-quality, humanlike writing.

One thing that I'm actually curious about and haven't done before: what happens if I take this entire text and throw it into something like GPTZero, which is an AI checker? And there you go, it says, "We are highly confident this text is entirely human", and this is AI written. So for anybody who has been doubting AI writing before, just think of the fact that these models are general purpose and meant to do everything. Just wait for the ones that are specifically made for creative writing, or use GPT-4.5, and you might be positively surprised. But let me tell you, formulations like this I've never seen from an AI model before. This is really good. So whenever you find yourself criticizing outputs of LLMs, or somebody else is doing that, just ask the question: hey, are they using the best model for that? Has that model even been created yet? Or is this a problem yet to be solved in the coming months? In the case of writing, hopefully we'll get to live test a model like this soon. I personally am looking forward to that. I thought this was important to cover, but now let's move on to some releases that are actually available today. Another Chat
10:29

Operator Available Everywhere

GPT update this week is that they made Operator available in all regions. You don't need a VPN anymore, anywhere in the world, so if you have a Pro account, which is the $200 account, you can use Operator now. But honestly, after testing it thoroughly, many of the things that would really be time savers and productivity unlocks just don't work yet. And that's where we enter another story of this week, which was Manus AI. So this
10:50

Manus AI

little application out of China caught so much attention on X upon release that people were freaking out: hey, this is the next big Chinese release. But then I think it's fair to say that the hype died down quite quickly. Essentially, what Manus is, is a better Operator; that's how they present themselves, and that's also how it works in practice. They've actually released this, and although it's not publicly available to everybody, you can apply for an invitation code, which they were first giving out quickly, then more slowly. I did manage to get my hands on this and had about three hours of play time in total, because they weren't my own access codes and accounts. One of them was a community session, one of my office hours, where we just played with the thing for an hour and a half and threw different prompts at it.

Here's my summary, especially when contrasting it with something like Operator or Anthropic's computer use. With both of those I've spent a substantial amount of time and tested, I don't know, it's got to be over 100 use cases for each, with different approaches. This thing, Manus, is better in certain ways and way worse in others. The main way in which it's worse is, simply stated, that it cannot use any accounts, and as most of the internet is behind an account, that's a massively limiting factor. I mean, if you think about it: every social media platform, your email, your Uber Eats account, heck, even something like Google Sheets or Google Docs, all of that is hidden behind a simple login. But it cannot use those, because Manus is built upon a combination of, one, Sonnet 3.7, Anthropic's model that's really good at multi-step thinking, coding, planning, and logic; it's a really good model for something agentic like this. And two, they have their own version of the Chinese Qwen model, which they optimized to work here. Because they're using Anthropic under the hood, and Anthropic is super limited in terms of stuff like using your accounts or saving passwords, and by limited I mean there's a red line that you just cannot cross, it doesn't let you do most things. Just a quick reminder that Anthropic is kind of the most conservative company in the space when it comes to AI safety; I suppose Grok 3 by xAI would be the other side of that spectrum.

So even if you look at the very first example on their website, planning a trip to Japan, one of the classic examples that these companies love to do: it is super impressive, because it uses a combination of Sonnet 3.7 and a web browser, just like Operator, and then it runs code to do certain calculations, does research for you, and puts it all together. That's a good use case, but at the end of the day you could be using something like o1 pro to put together an itinerary like this, or you could be running a deep research. And as a matter of fact, let me tell you, deep research actually does a better job. Like, look at that: it's way more detailed, it has the opening times, it makes more tailored recommendations, and it surprised me personally more often, but that depends on how you prompt it. In the end it created this massive table of all the different places and their opening times for me. I'm actually planning a Japan trip for April, and what I'll do is simply print this table and bring it with me, and then I have a more detailed itinerary than this one put together.

I could keep speaking here for another 20 minutes and talk about different examples and prompts and how differently they perform, but essentially it all comes down to this: due to the fact that it cannot use accounts online, a lot of the things that actually work inside of Operator, like ordering groceries or booking trips for you, don't work inside of this. Fair enough, you might say, those are not the most useful use cases. Well, true. And if you look at these, none of them are really things that you would want to do with Operator; these are things you would more likely be doing with deep research. So what I found by myself, without even referencing any other creators or anybody else's reviews, is that this is more of a deep research competitor than an Operator competitor. Nobody really knows what goes on in the background of deep research. Maybe it writes and executes its own code, and obviously it has a web browser and a thinking model plugged into it too. It's just that the thinking model of deep research is o3 and this is Sonnet 3.7, and o3 is still ahead of the curve of everyone else. It just is. These responses just don't cease to surprise me, and I still run deep researches most days of the week.

So if you're looking for something like B2B suppliers or a list of YC companies, or analyzing a stock, well, these are things you could be doing within deep research. But I think the thing that you get here is a more interactive experience: it shows you a to-do list, it uses different tools to complete it, and it works really well in many of those cases, unless you need to use some account, which is also a limitation of deep research. That's why I concluded that this is more of a deep research competitor, and it's just another, slightly weaker deep research, in my opinion. And it's fine. It's just not the new OpenAI killer or the next Chinese DeepSeek or whatever some people on X make it out to be. Again, this is just based on a few hours of usage, and I might change my opinion over time when I try more use cases or see specific things that go way beyond anything I've seen so far. But so far: good product, but definitely overhyped, in my opinion. Just my two cents here. Feel free to discuss in the comment section; maybe I'm missing something.

15:23

OWL

And if you don't have an early access code and want to get your hands on something similar today, there's already an open source repo called OWL. I didn't really play with this, but it's a workforce with multi-agent assistance that does all of the things that Manus does, and you can kind of customize it. As with any of the other open source alternatives, I would probably expect this to perform a little worse. Again, this one I didn't actually try; I just wanted to point out that it popped up on my radar and show you that people are rebuilding this already. On this GAIA benchmark that Manus boasts about, they achieved a score of 58, which I suppose puts it behind both deep research and Manus, but it's open source. Okay, on to the next one,
15:57

Responses API

which is the OpenAI Responses API and their new agentic framework. Okay, so this one is obviously a very developer-focused release from OpenAI. They basically launched a brand new API which includes things that you really didn't get access to before, and it's all under the hood of one thing that they call the Responses API. Really simply explained, I would say that before, you had an API that responded with text. If you wanted to search the web and get responses from that, you needed to use a different API. And if you wanted to build functionality into your app where you could upload images or documents and get an AI response, well, you guessed it, you needed a different API; concretely, that was the Assistants API from OpenAI. Now they've combined all of that into one thing, which is just called the Responses API. So you just send it a request and the Responses API responds. It includes internet search, file upload, and even Operator access, which allows the API to use a computer, browse the web, and do things. Pretty cool. I'm excited to see what people build with that. As I said, Operator is not extremely capable yet, but for little tasks across the web it can be useful. I personally sort of stopped using it since it ordered like two or three dozen bananas to the wrong address in Lisbon, and then I made another order and it ordered another basket full of groceries to, yet again, the wrong address. But hey, all of that is behind the API now, and you can call it programmatically, which is
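To make the "one request, several built-in tools" idea from this segment concrete, here is a minimal Python sketch that only builds the JSON body for such a request and does not send it. The field names (`model`, `input`, `tools`) and the tool type strings (`web_search_preview`, `file_search`) are assumptions based on OpenAI's launch documentation and should be checked against the current API reference. The model name, the vector store ID, and the helper function itself are placeholders invented for illustration.

```python
import json
from typing import Optional


def build_responses_request(prompt: str,
                            web_search: bool = False,
                            vector_store_id: Optional[str] = None) -> dict:
    """Build (but do not send) a request body for POST /v1/responses."""
    tools = []
    if web_search:
        # Assumed tool type for built-in web search.
        tools.append({"type": "web_search_preview"})
    if vector_store_id is not None:
        # Assumed tool type for searching previously uploaded files.
        tools.append({"type": "file_search",
                      "vector_store_ids": [vector_store_id]})

    body = {"model": "gpt-4o", "input": prompt}  # placeholder model name
    if tools:
        body["tools"] = tools
    return body


request_body = build_responses_request(
    "What did Google announce for AI Studio this week?",
    web_search=True,
)
print(json.dumps(request_body, indent=2))
```

With the official `openai` Python package, a body like this would typically be passed as keyword arguments, for example `OpenAI().responses.create(**request_body)`, which needs an API key and network access.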
17:14

Agents SDK

amazing. And then they came out with this second thing called the Agents SDK, and rather than explaining that, I want to show you an example from Stripe that I really liked. It's an example of how Stripe implemented this with their payment services, and it's about a freelancer who has a list of all the people he served that month, and in some form he noted: hey, these people have paid and these people have not. This is a perfect example for an AI tool. I mean, look at this little table right here: it's a little check mark or a cross. Maybe this was created in Notion, or heck, this could even be a handwritten note that you just provide to the agent. As you can see, it lists all the clients, and then, together with two little agents, one responsible for searching the files and one responsible for actually making the Stripe invoices on Stripe's side, the agents could look at this little table, go through everything, and automatically issue and email all the invoices that were still missing. This way you can weave together different functionalities and different apps and create these little programs that run with agents in the background and get everyday tasks done.

Now look at this, the second example that Stripe also showed, which I thought was amazing too: basically having an email agent that scans for emails like "hey, I need an invoice for this purchase", and then the agent automatically fetches that invoice and sends it back to the customer. I want to be clear: these aren't things that were impossible before. They were just a bit more convoluted; you had to use a specific platform and multiple API endpoints to make everything work together. Now, with OpenAI's own SDK, this just became so much simpler, and simpler also means it's easier to troubleshoot and more reliable. So yeah, all these agentic workflows are really happening. Step by step, week by week, we keep progressing, they keep getting better, and I'll report back if I see any big unlocks that are relevant for a wider consumer base. But there you go. I think especially with that Responses API, OpenAI really delivered for all devs that use AI in their applications. Okay, next up is kind
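The division of labor in the freelancer example (one agent reads the client list, a second one issues invoices for the unpaid entries) can be sketched in plain Python. This is a toy illustration of the data flow only: it uses no LLM and does not call the Agents SDK or Stripe, and every name in it (`Client`, `find_unpaid`, `draft_invoices`, the sample clients and amounts) is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Client:
    name: str
    amount_usd: int
    paid: bool


def find_unpaid(table: str) -> List[Client]:
    """Stand-in for the file-search agent: parse the table, keep unpaid rows."""
    unpaid = []
    for line in table.strip().splitlines():
        name, amount, paid = [cell.strip() for cell in line.split("|")]
        if paid.lower() == "no":  # "no" stands in for the cross in the table
            unpaid.append(Client(name, int(amount), paid=False))
    return unpaid


def draft_invoices(clients: List[Client]) -> List[Dict[str, object]]:
    """Stand-in for the invoicing agent: one draft record per unpaid client."""
    return [
        {"customer": c.name, "amount_usd": c.amount_usd, "status": "draft"}
        for c in clients
    ]


# Hypothetical client list: name | amount | paid?
CLIENT_TABLE = """
Acme Studio | 1200 | yes
Blue Harbor | 800  | no
Cedar Labs  | 450  | no
"""

invoices = draft_invoices(find_unpaid(CLIENT_TABLE))
for invoice in invoices:
    print(invoice)
```

In the demo described above, these two steps are handled by separate agents; the sketch only shows what gets passed from the first step to the second.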
19:00

Tavus.io

of a fun one. We like to throw in live demos, and this is not affiliated, but we found this new application that does something new. We've seen similar things with HeyGen and Synthesia and other players doing AI video, but this company, Tavus.io, is really connecting a lot of these tools. Their claim is that all of these chatbots shouldn't actually be interfaced like ChatGPT; they should be a person sitting there talking to you, just like I am right now. That's what their product does, and they have a little demo, which I always like doing during these videos. So let's give this a shot: start a video chat with Charlie over here. I'll just get out my headphones so I can actually hear Charlie, and let's start this video chat.

"Yo, what up, Charlie? How are you doing today?"

"I'm doing great, thanks for asking. It's always awesome to meet so..."

"Ah, listen, man, you're a bit robotic over there, and also, I don't know, it doesn't feel so fluent. What have you got to say for yourself?"

"I see what you mean. I can come across a bit robotic at times, but hey, I'm working on it, man. I'm a large language model, so sometimes I might sound a bit stilted, but I'm always trying to improve and be more like a human, you know, like a cyborg but without the cool robotic limbs."

"Brother, I love your... okay, that's a great... okay, fair enough. Uh, please, first of all, don't interrupt me. Secondly, can you tell me a joke?"

"Sorry about interrupting you earlier. I'll make sure to let you finish your thoughts from now on. Okay, here's a joke for you: why did the neural network go to therapy? Because it was struggling to process its emotions. Hope that one computed correctly and brought a smile to your face."

"Um, actually, no, Charlie, it didn't."

But there you go. I wanted to do this to show what a mix of these different tools might look like. I think it might be super early, and obviously this product is not at the stage where it's going to be changing the world yet, but I think we're getting closer. Last week we looked at some of these voice generation models that sound so convincing. Here they tried to plug multiple things together, with a result that might not be that convincing yet, but I wanted to do this little segment to point out that we're getting closer every week. This is something you can keep in the back of your mind as we review some of the releases: eventually they will all be combined into one thing. All right, on to the next one, which is an overview of the top 50
21:05

Top 50 AIs

GenAI consumer web products, as published by a16z about a week ago. I just quickly wanted to spotlight this, in case you were ever curious what people are actually using. As you might know, at the AI Advantage we have our own rankings, but this is a big overview based on traffic. Also keep in mind they are a venture fund that specializes in technology investments, so obviously these will be tinted by their economic interests; take a list like this with a grain of salt. Nevertheless, it's a good list, and I'm kind of proud to say that I've tried and tested most of these. There are a few curious ones which I hadn't heard of before: what is Joyland, or Crushon AI? Oh, and yeah, I also haven't tried SpicyChat, but that's not really the domain of apps that I try. I mean, I haven't visited it, but I can imagine what SpicyChat would do. Naughty. But if you want to explore some new apps, well, this is one list to do that.

This is also a great point to remind you yet again that we do a ranking of our own that we update every single month, though that one is purely based on my, the team's, and our community's personal preferences. These are the apps we see members actually using day to day, with reasoning for why they're up there, and if you have a gripe with this list, you can leave a comment below and we can discuss it down there. We do this for LLM platforms, image generation tools, and video generation tools on a monthly basis, and we're considering a vibe coding ranking right now that might come soon; let's see if that makes sense. Either way, if you're looking to get oriented within the space, this is probably a great starting point.

I also want to add one more comment here: I know a lot of people come to this channel as complete newcomers to AI, and they just want to start from the beginning. Arguably, this video might not be the very best place for that. We reference information and releases from past weeks, and this is really about staying at the cutting edge and finding out about the new things that keep coming out, not about building foundational skills. But I realize that many people want to do that, so if you're just getting into this, I have two recommendations for you that we built, one of them free and one of them paid.

The free one is our weekly newsletter. We release it for free, and we put together an onboarding sequence where you get a prompt template right at the beginning, with some of our favorite prompts and techniques to get you kickstarted right away. Everything newsletter related is completely free, and we don't even sell sponsored slots because we want to keep the experience as pure as possible there. But in the process of receiving some of those materials, we will recommend our community, which is really our premium experience and the ultimate answer to: how do I get into AI, how do I actually learn some of these skills, and where can I get my questions answered? The community features a structured onboarding and multiple courses that teach you foundational skills; you can pick the ones that matter to you. Then we run around 20 events a month and release weekly guides and resources to keep you up to date, even if you don't engage with them on a weekly basis. The structured workflow guides and the video courses, for example, in March we're releasing a brand new one on fine-tuning models, the very best technique in the world to make an LLM sound exactly like you, all of that comes with a community membership. So if you're interested in actually learning this AI thing from the ground up, then both the newsletter and the community are the very best ways to do that.

And that's really all I've got for this week. A lot of developments, a lot of things merging, as I mentioned in the beginning. And again, I realize this stuff is not easy to keep up with, so if you're just getting started, sign up for our newsletter. Other than that, I will see you very soon on another episode of AI News You Can Use. But this is all I have for today. See you soon.
