AI Image Revolution, Gemini 2.5 Pro & More Use Cases

The AI Advantage, 28.03.2025, 50,258 views, 1,392 likes, updated 18.02.2026
Video description
This was a massive week in AI with a ton of new best-in-class tools releasing. We've got the new ChatGPT Image, Gemini 2.5 Pro, Ideogram 3.0, the new Reve image generator, and so much more. Watch the video to learn how to use all the new AI tools and features! Links: https://www.anthropic.com/engineering/claude-think-tool https://x.com/sama/status/1904957253456941061 https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/#building-on-best-gemini https://huggingface.co/deepseek-ai/DeepSeek-V3-0324 https://openai.com/index/introducing-our-next-generation-audio-models/ https://x.com/karpathy/status/1903671737780498883 https://x.com/gfodor/status/1904980153329242147 https://lmarena.ai/ https://preview.reve.art/ https://ideogram.ai/ Chapters: 00:00 What’s New! 00:31 ChatGPT Image Generation 01:46 Gemini 2.5 Release 04:49 GPT-4o Update 05:43 DeepSeek V3-0324 06:45 Anthropic Web Browsing 07:14 Anthropic's “Think” Tool 08:23 OpenAI Adopting MCP 09:34 OpenAI Audio Models 11:29 Mobile App Vibe Coding 12:43 How to Keep Up with AI 15:33 New AI Image Generators #ai Free AI Resources: 🔑 Get My Free ChatGPT Templates: https://myaiadvantage.com/newsletter 🌟 Receive Tailored AI Prompts + Workflows: https://v82nacfupwr.typeform.com/to/cINgYlm0 👑 Explore Curated AI Tool Rankings: https://community.myaiadvantage.com/c/ai-app-ranking/ 🐦 Twitter: https://x.com/IgorPogany 📸 Instagram: https://www.instagram.com/ai.advantage/ Premium Options: 🎓 Join the AI Advantage Courses + Community: https://myaiadvantage.com/community 🛒 Discover Work Focused Presets in the Shop: https://shop.myaiadvantage.com/

Table of contents (12 segments)

  1. 0:00 What’s New! 124 words
  2. 0:31 ChatGPT Image Generation 293 words
  3. 1:46 Gemini 2.5 Release 660 words
  4. 4:49 GPT-4o Update 191 words
  5. 5:43 DeepSeek V3-0324 246 words
  6. 6:45 Anthropic Web Browsing 107 words
  7. 7:14 Anthropic's “Think” Tool 276 words
  8. 8:23 OpenAI Adopting MCP 262 words
  9. 9:34 OpenAI Audio Models 480 words
  10. 11:29 Mobile App Vibe Coding 289 words
  11. 12:43 How to Keep Up with AI 745 words
  12. 15:33 New AI Image Generators 658 words
0:00

What’s New!

So this week in generative AI we had multiple best-in-class releases, meaning a new tool came out that instantly took the first spot in its respective category. Concretely, I'm talking about OpenAI's image generation, Google's Gemini 2.5 Pro model, and DeepSeek V3, and DeepSeek decided to open source this one again. So just between those stories there's so much to talk about, but as per usual there were even more AI releases that you can put to work today, and we'll look at all of them in this week's episode of AI News You Can Use, the YouTube show that rounds up all the generative AI releases that you can put to work today. Let's get into it. Okay, so
0:31

ChatGPT Image Generation

first up we have OpenAI's image generator, which you might have heard about at this point. Well, let's just say if you've seen an image in the style of Studio Ghibli this week, chances are it was generated by this new OpenAI update. But as you might know, I already created two videos covering this release: one super short 4-minute one and one 20-minute video comparing it to all the competition out there. So if you care to learn more about this release, check out that video, and later in this video we'll be doing a comparison with some of the other releases, because this is not the only image generator that came out this week.

But for now I just want to show you one conversation that Dom from our team had with the new OpenAI image generation model. Look, he created a 3D model of a black Labrador on a transparent background here. Then he prompted for a different view of the Labrador, and then he continued to create a screenshot of a video game with the Labrador as a character. Now, as there are two views of it already to be referenced in this, he had it create a character selection screen here and then restyle the game into a 2D pixel-art-styled adventure game, like so. And I wanted to show you this because it really shows off the versatility of this tool, as it's a mix of LLM and image creation model. If you haven't heard about this yet, definitely check out the two other videos on the channel, and we're already working on a dedicated use case video because there's so much you can do in here that I feel like it's worth another video. But now
1:46

Gemini 2.5 Release

let's turn our attention to the next big release this week, and that's the one that was overshadowed by this image generation. It's the Google Gemini 2.5 Pro model, which right now is already accessible through Google's AI Studio, as per usual for a limited time. You can try it out there: you simply switch the model over to Gemini 2.5 Pro Experimental and then you can use it in here, or in Google's Gemini Advanced, their ChatGPT competitor, but there you do need a paid plan to access this new model.

So what's so special about this? Well, some people consider this the very best thinking model ever released, the only potential competition being o1 pro by OpenAI. But when it comes to benchmarks, this really crushes everything else on many dimensions. I mean, look at this graph: compared to some of the other top thinking models out there, like Claude 3.5 Sonnet or o3-mini high, it's ahead in almost all benchmarks, including an impressive 18.8% on Humanity's Last Exam, the notoriously challenging benchmark that, well, it just wins at.

But as you might know, the benchmarks only tell a part of the story. Often it's really about the other things, like context length or the tool use or the vibes of the model, and how people adapt to it and if they use it. And this model checks a lot of those other boxes. It's integrated into Gemini Studio, so you get all the tooling that comes with that, which is nowhere close to as deep as what OpenAI has now, but it's solid. And maybe more interestingly, this model comes with a 1 million token context window, which you can see in either the announcement or in Google AI Studio, and more importantly, the performance of this long context window is exceptional.

So I want to highlight this benchmark score particularly. It's how these models perform with long context depending on how many tokens you put in. Okay, so essentially how well they perform depending on the amount of text, because yeah, a lot of models have large context, but that doesn't mean they can use it well. And if you look at this line of Gemini 2.5 Pro, you will see that all the way over at 120,000 tokens of context it scores a 90.6 out of 100, and the competition is nowhere close to that. I mean, look at this: Claude 3.7 Sonnet Thinking, Anthropic's best model, 53. o1, 53. I suppose GPT-4.5 is better, but that's not a thinking model like this one. I'm missing Grok 3 in this view, and I also would love to see o1 pro on here. So that's really impressive.

But at the end of the day, it really matters if people actually adopt this thing, because look, the specs and the benchmarks of this model look really good, but the OpenAI image announcement completely overshadowed this. By the way, that timing was no coincidence: they announced a spontaneous live stream about one to two hours after Google announced this model. But most people are sort of over this notion of a better LLM. It's kind of hard to deny that a certain plateau has been reached, and even though these benchmarks are better and better, for common users there's no real difference in utilizing these models. Sure, everybody has their preferences, and for coding specifically I think there are clearly better models and worse models, but if you don't work with code, a lot of this comes down to personal preference. So for example, GPT-4.5 might not have the best benchmarks on everything, but it is my preferred model because I find myself doing a lot of psychological and brainstorming-related use cases, and I love it for that. The vibes are the best.
4:49

GPT-4o Update

Okay, in just 24 hours after the release of Gemini 2.5 Pro with all of these impressive benchmarks, OpenAI actually reacted with an update to their main model, GPT-4o. There are a few quality-of-life improvements like fewer emojis, better instruction following, especially with multiple instructions, and improved capabilities when it comes to complex or coding tasks, something that Gemini 2.5 Pro excels at, and also improved intuition and creativity. Now here's the interesting part: as Gemini 2.5 Pro released, that model topped the LMArena leaderboards, but now with this GPT-4o update, GPT-4o pulled right into second spot again, and 4o is not even a thinking model. So I don't know what else to tell you except that the AI wars continue here and we're getting better and better products every single week. Figuring out which one of these models will work for you on your specific use cases absolutely goes beyond the scope of this video. I'll be sharing my opinion on these over time as I get more experience with them. Nevertheless, we're talking about this because it's a real advancement,
5:43

DeepSeek V3-0324

as is the next story, which is very similar by the way, so I'll keep it a bit shorter. China's DeepSeek released another model, and this is a non-thinking model, so this is not the successor to their R1 model. That was the thinking model. This is important, okay: they released V3, and this is their non-thinking model, so something that directly competes with, let's say, GPT-4.5, and again it crushes the benchmarks. Now, the interesting thing about this one in particular is that they released it under the MIT license, which means it's open source. Now sure, it's really a give and take on some of these benchmarks if you look a little closer into all of this, but basically it's on par with 4.5 and Sonnet 3.7 and Qwen Max, which are three of the models to beat here. But this thing is open source. So yet again, just like with R1, they just put this out there. Anybody who wants can download this model and use it with their own applications: no API, no cost per usage, none of that, just an LLM that performs at the intelligence of the best models in the world but is freely available. Well, that's pretty wild, and these releases are exactly what pushed the Western AI companies to ship like crazy, which they are doing. So let's move on to the next story so you can hear all about this, and the next, let's say,
6:45

Anthropic Web Browsing

cluster of stories comes out of Anthropic. The first one is something that I barely missed last week. It came out Thursday night for me and I had to record Thursday afternoon, and it's web browsing for Anthropic's Claude. Pretty straightforward. As of now it's not available in the [unclear] yet, and honestly, even in preparation for this video I didn't even bother with a VPN or anything, because we're going to run our more in-depth tests on this thing. But basically every other tool has web browsing now, so they were really just catching up on something that was very obviously missing inside of their Claude chatbot. But
7:14

Anthropic's “Think” Tool

more interestingly, I want to cover their release of this "think" tool that, in their words, enables Claude to stop and think in complex tool use situations. Now, this is a bit complex, but I can simplify this entire release with the following statement. Right now you have non-thinking models and thinking models. The thinking models think before they give you an answer. The non-thinking models don't; they just start generating an answer right away. This is a thinking approach that is used with a non-thinking model: the model starts generating the output, but once it arrives at something that might warrant some thinking, it uses this think tool. It sort of just selectively thinks when it's needed.

And I think this direction is really the future of many of these products. You're not going to have a model picker that looks like this, where you need to know what all of these do best and when to use which one. That's the state of it right now. As they already announced, there's just going to be GPT-5, and it's probably going to use something like this in the background, where it selectively decides when to think or when to just give you an answer, when to generate an image or when to vibe code an entire application for you. Anthropic has proven to be ahead of the curve on many of these releases, so looking at things that they're releasing, like this think tool, always gives a bit of an insight into where the ball is heading next when it comes to these consumer products. And another exciting
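Editor's sketch: to make the idea above concrete, here is a minimal Python illustration of what a "think" tool could look like. The tool is defined like any other tool (a name, a description, and a JSON schema for its input), but calling it does nothing except record the thought. The exact wording of the description, the `thought_log` list, and the `handle_tool_call` helper are assumptions made for illustration, not Anthropic's implementation, and no API call is made here.

```python
# Tool definition in the JSON-schema style commonly used for LLM tool use.
# The description text and field values are illustrative assumptions.
THINK_TOOL = {
    "name": "think",
    "description": (
        "Use this tool to think about something. It does not fetch new "
        "information or change any state; it only records the thought."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {
                "type": "string",
                "description": "A thought to think about.",
            }
        },
        "required": ["thought"],
    },
}

# Hypothetical in-memory log; a real application might store this elsewhere.
thought_log = []


def handle_tool_call(name, tool_input):
    """Hypothetical handler: the think tool has no side effects beyond logging."""
    if name != "think":
        raise ValueError(f"unknown tool: {name}")
    thought_log.append(tool_input["thought"])
    # The model only needs an acknowledgement; the value is in having
    # written the thought down in the middle of its response.
    return "Thought recorded."
```

In a real application this definition would be passed alongside the other tools, and the model would decide on its own when a step is tricky enough to call it.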
8:23

OpenAI Adopting MCP

development for Anthropic is that OpenAI is actually adopting their Model Context Protocol across their products. The short form is MCP; you might have heard of it before, as we talked about it on the show. Now this is incredible, and something that most people wouldn't have considered realistic when MCP launched. Just as a quick refresher, MCP is basically when you give an LLM various tools, and it's an open standard. So basically the tools get hosted on a server and then you can connect to that server. Right here on this machine I gave it access to a few MCP servers. One of them gives it web search, another one gives it the ability to actually manipulate files on my computer, so I could create folders on the desktop and move files around, things like that. And these MCP servers are sort of standardized tools that will now also be usable with the ChatGPT desktop app, their Responses API, and the new Agents SDK that they talked about. This is absolutely incredible. This is really a step in the right direction, and I was super surprised to see this because OpenAI has been the most restrictive, trying to maximize all the value they create for their shareholders. So yeah, this is very exciting news. More on this when it actually ships; admittedly, this is just an announcement right now, but it fit too well with the entire Claude segment. So let's move on to the next piece of AI news that you can actually use, which is OpenAI's audio
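Editor's sketch: the "tools hosted on a server" idea can be illustrated with a toy, in-process dispatcher in Python. It is loosely modeled on the JSON-RPC style messages MCP uses for listing and calling tools (`tools/list` and `tools/call`), but it is a simplified assumption-laden illustration, not the official SDK and not a complete or compliant server. The `create_folder` tool, the `TOOLS` registry, and `handle_request` are hypothetical names, and the tool only simulates its action.

```python
def create_folder(name: str) -> str:
    # Simulated action; a real server would actually touch the file system.
    return f"created folder {name}"


# Hypothetical registry standing in for the tools a server exposes.
TOOLS = {
    "create_folder": {
        "description": "Create a folder on the desktop (simulated).",
        "inputSchema": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
        "handler": create_folder,
    }
}


def handle_request(request: dict) -> dict:
    """Answer a JSON-RPC style request for listing or calling tools."""
    method = request["method"]
    if method == "tools/list":
        # The client first discovers which tools exist and what input they take.
        result = {
            "tools": [
                {
                    "name": name,
                    "description": tool["description"],
                    "inputSchema": tool["inputSchema"],
                }
                for name, tool in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        # The client then asks the server to run one tool with arguments.
        params = request["params"]
        text = TOOLS[params["name"]]["handler"](**params["arguments"])
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32601, "message": "Method not found"},
        }
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}
```

The point of the standard is that any client that speaks these messages can discover and call the same tools, which is why adoption by a second major vendor matters.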
9:34

OpenAI Audio Models

models that also shipped last Thursday night, so I just briefly want to talk about this. Basically they shipped new developer tools to make it really easy to create voice-enabled chatbots, and this covers all aspects of voice: text-to-speech, speech-to-text, and transcription through various models, like an updated Whisper and new speech-to-text models, including a small one that is really cheap. Basically everybody who has an app and who wants to include voice AI features can now do it really simply at the highest quality level.

And I want to quickly share something here, because I love to give you a bit of an insight into the different things we do at the AI Advantage beyond the YouTube channel. One of them is the community, as you know, but the second one is also increasingly consulting with companies who are building products in this space, to help them either keep an eye on all the things that are out there or to bring an informed opinion to the table on what users actually want or which services to use. And one of these that came up over the last week was the various audio models. So we created an entire ranking, and it basically breaks down to this comparison table, which I quickly wanted to share with you here. As you can see, the audio API by OpenAI ranks highest here, next to both Scribe and Sonics. Now, I'm not going to go into the details here, but basically if you're a consumer and you want to take advantage of some of these voice features or transcription features at scale, your best bet is probably ElevenLabs Scribe or this Sonics service, which actually surprised us, but it works surprisingly well. It's just a bit slower than the ElevenLabs one. And if you're building an app, there's really no competition to all the new API endpoints that OpenAI opened up with this release.

I want to say that with everything that we do at the AI Advantage, I always pride myself in going a little deeper than we have to. We love to run tests and have multiple people try things out before we give you some sort of opinion, and as the company grows, I hope that the quality of the information that we can provide you in videos like this is only going to keep increasing. But nevertheless, we would be nothing without you, the viewers. So thank you for being here, for being interested and curious. Truly, we would be nothing without you, and let me just speak a heartfelt thank you on behalf of the entire team. I'm pretty confident in saying we all really love doing this, and I hope you enjoy watching it too. Now let's move on to the next one, and this
11:29

Mobile App Vibe Coding

is just a quick feature of a use case that I found particularly interesting, and that's Andrej Karpathy, ex-head of AI at Tesla and OpenAI co-founder, who basically built an entire iOS Swift app with AI. This is something that doesn't get talked about too much. People usually talk about websites, maybe a SaaS business or some WordPress plugin, things like that, but it can also build iOS apps. He shared the various chats that he had with ChatGPT, where you can see all of his prompting and his process, which ultimately results in a legitimate iOS application. Now, he never did this before, so this comes from the perspective of, well, maybe not a development beginner, but a beginner when it comes to coding iOS apps.

By the way, if you're not familiar, Andrej also coined the term vibe coding. I think he was the first one who mentioned it, and it really caught on. And also he's originally from Slovakia, which is actually my home country. My passport is Slovak, my parents are Slovak, I just happen to be very worldly in terms of where I live and how I was raised, but hey, go Slovakia. Actually, fun fact: his second name is sort of almost identical with the Carpathian Mountains, which go all the way into our capital, and the end of that mountain range actually houses the Bratislava Castle, on top of his name, sort of. Anyway, I thought it was fascinating to see that you can build a mobile app with vibe coding, and here you have the entire blueprint of how Andrej did this step by step. Pretty interesting if you ask me. Now on to the next one. If
12:43

How to Keep Up with AI

you're viewing this video, chances are you realize how fast AI moves. It's a new feature or a brand new model every single week, and for mere mortals it's almost impossible to keep up. Well, that's why I want to show you two things. One of them is completely free and one is paid, and we at the AI Advantage create both of them.

The free one is the LLM rankings that you might already know about, but this really is the answer to the question: hey, if I only have 10 minutes a month and I want to stay on top of AI, what do I do? Well, you check out our rankings, which we update every single month. With a quick glance you'll be able to see what tools are at the top right now for this particular month, and if you're curious about one of them, you can scroll down and look at the reasoning. This is freely accessible, and we do this across LLM platforms, image generation tools, and video generation tools every single month.

Now, if you're looking for more knowledge, that's why we built the community. One of the things we do in the community is release brand-new guides and resources on a weekly basis. You can get a taste of some of these in the free area of our community, where we share various guides and previews of courses we have in there. But let me just go inside the paid area of the community, sort these by the most popular ones, and for example show you this massive guide on Canvas and all the hidden features within it. Now, there really is a lot here, including various little tips and tricks that you might not have known, and there's a comment section underneath with people encountering and solving various problems, or just straight-out tips and tricks from the community.

Now you might think to yourself: hey, AI moves so quickly, what's the point of doing all these guides if they're going to be outdated anyway? Well, actually, in March 2025 we went through the work and all 12 members of the team updated every single guide in our community. That's over 150 guides. Now, some had to go through a complete overhaul, some just needed minor updates, but pretty much two-thirds of these needed some editing, and all of them are updated for today. So whether you want to put yourself into movie scenes or get step-by-step workflows for the latest features, this is the place to get it.

And one final note here is that you can really use these guides in a creative way, and this is something I like to do. For example, here's this big guide on deep research and various use cases that are a bit more detailed than what we do on YouTube. What I do sometimes is just copy-paste the whole guide, and then you can go into any conversation. Here I was creating some banner, and you can run a prompt like "which one of these OpenAI deep research prompts could I use to improve this?" If I paste the entire guide in here, it will look at all of the content and use the step-by-step tutorials and knowledge inside of the guide with the current use case that you're working on. You don't even have to read the guide, you just have to know how to copy-paste, and then it suggests one of the prompts that could work really well with the current project that I'm doing here. So here it wants me to do a competitor analysis to look at some other communities that might have similar marketing materials to what I was creating here. Super smart, no? Kind of a smart complementary use case that I might not have thought of myself. And you can do this with every single one of the 150 guides. All of these are handwritten step-by-step guides with all the details that ChatGPT needs to help you with your tasks. So it's not just that you have access to them by joining the community, it's also that your AI assistant gets access to these, and you get all that and so much more inside of the AI Advantage Community. That's just a quick little hack that I wanted to share with you today, and now let's get back to
15:33

New AI Image Generators

the video. All right, next up, as I alluded to before, there were actually two more image generator releases besides GPT-4o image, and all of these are really good at text. So we ran a bunch of comparison prompts, but I really want to take a closer look at one of them, which is this one: "photo of a highway road with a large billboard by the side of the road that says this text written in bold letters on the billboard, cars passing by". Now, the two other models are Ideogram 3 and Reve, I think that's how it's pronounced, Reve, and these are excellent models, especially when it comes to text, so the competition is just unbelievable.

So look at them side by side. This is the result from OpenAI. Mind the flawless text and the realistic look. And this is the result from Ideogram, text also looking good. Now, I do have to note that if you run this multiple times, you'll realize that Ideogram sometimes messes up a letter. I mean, it's not a lot, but it definitely happens more often than with the other models. Two out of four here were straight-up wrong, and with ChatGPT that just doesn't happen at this length. I ran this four times and got it right all four times. And with Reve it's the same deal: the text is just correct every single time. But I would argue that, for example, this image is a bit lower quality. I mean, look at these wooden posts just disappearing into nowhere, and there are a few funky details like cars on both sides of the road driving in the same direction, or whatever this is. And you can find these consistently across the images. I mean, look at this one, or this car just parked in the grass, or this one with the sign in the middle, whereas Ideogram looks absolutely fantastic, and so does GPT-4o image generation.

Now here are a few more comparison images for you between Ideogram, ChatGPT 4o, and Reve, so you can make up your own mind. But overall I would say that just the convenience of GPT-4o is hard to compete with if you're already using ChatGPT. The tool clearly is the best at some of these things and at the very least can hold its own at some of these benchmarks against the competition, meaning it's as good as the rest or better. But it also has the ability to edit things with words, and it's right there in ChatGPT. You can blend multiple images together, and the entire interface is just super intuitive, making it, in my opinion, the best image generator model right now. But hey, the reason these videos exist is so that you can make up your own mind based on the comparisons that we just showed you on screen. All of these are incredible models that you should absolutely consider, and just pick the one that works best for whatever you're trying to achieve.

And that's everything we've got for this week. I'll be hopping on a flight to San Diego in about 10 hours, actually. I'm going to a conference there, and afterwards I'll be going to Japan for two weeks to fulfill the one item on my travel bucket list that I never got around to doing, which is experiencing Japan in spring. Now, the only thing that means for these videos is that the background will change, but I'm bringing my mobile studio, so you can expect a similar quality and upload schedule as you might be used to from this channel over the past few years. I haven't missed a weekly YouTube upload in almost four years at this point. Hey, but someone's got to do it. All right, that's it. If you enjoyed it, leave a like. My name is Igor, and I'll see you soon.
