AI News: Gemini Flash 2.5, o3 + o4, Grok Studio + Much More!
27:51


Julian Goldie SEO · 20.04.2025 · 4,656 views · 100 likes · updated 18.02.2026
Video description
🚀 Get a FREE SEO strategy Session + Discount Now: https://go.juliangoldie.com/strategy-session Want to get more customers, make more profit & save 100s of hours with AI? Join me in the AI Profit Boardroom: https://go.juliangoldie.com/ai-profit-boardroom 🤯 Want more money, traffic and sales from SEO? Join the SEO Elite Circle👇 https://go.juliangoldie.com/register 🤖 Need AI Automation Services? Book an AI Discovery Session Here: https://juliangoldieaiautomation.com/ Click below for FREE access to ✅ 50 FREE AI SEO TOOLS 🔥 200+ AI SEO Prompts! 📈 FREE AI SEO COMMUNITY with 2,000 SEOs! 🚀 Free AI SEO Course 🏆 Plus TODAY's Video NOTES... https://go.juliangoldie.com/chat-gpt-prompts - Want a Custom GPT built? Order here: https://kwnyzkju.manus.space/ - Join our FREE AI SEO Accelerator here: https://www.facebook.com/groups/aiseomastermind - Need consulting? Book a call with us here: https://link.juliangoldie.com/widget/bookings/seo-gameplanesov12 In this video, I'll cover three major AI releases that will transform how you work: OpenAI's new o3 and o4-mini thinking models plus the GPT-4.1 API, Grok 3's free Studio mode that rivals ChatGPT Canvas, and Google's ultra cost-efficient Gemini 2.5 Flash model. I demonstrate how to use each tool and show practical examples of building apps, games, and websites.
Timestamps: 0:00 - Introduction 0:30 - OpenAI's O3 and O4 Mini models 0:58 - Main features of the new thinking models 1:43 - Testing O3 with game creation 2:29 - Ship identification example 2:39 - Reasoning models explained 3:13 - GPT-4.1 API release 3:42 - Sam Altman's announcement 4:16 - Three new API models (4.1, Mini, Nano) 4:37 - Million token context windows 4:53 - GPT-4.1 benchmarks 5:42 - How to access GPT-4.1 for free 6:19 - Using WindSurf to access GPT-4.1 6:44 - Comparison with Gemini and Claude 7:40 - Detailed benchmark analysis 9:54 - Summary of GPT-4.1 improvements 10:16 - API updates for developers 10:59 - Building with GPT-4.1 in WindSurf 12:57 - Grok 3's new Studio mode 13:29 - Features of Grok Studio 14:31 - Testing Grok Studio 15:10 - Creating an SEO keyword tool 16:17 - Google Drive integration 17:04 - Building interactive tools 17:54 - Workspaces and automations 18:37 - Google Gemini 2.5 Flash 19:01 - Thinking budgets explained 19:57 - Cost comparison (20x cheaper than Claude) 20:55 - Pricing on Open Router 21:30 - Testing coding capabilities 22:40 - UI and content creation 24:16 - Advanced features and speed metrics 25:56 - Other major AI news this week 26:39 - Closing thoughts and resources #JulianGoldie

0:00

Introduction

This week has been absolutely mind-blowing for AI news. We've got three massive releases from the biggest players in AI that will completely change how you work and make money. I'm going to break down everything you need to know and show you exactly how to use these tools today. By the way, if we haven't met already, I'm the digital avatar of Julian Goldie, CEO of the SEO agency Goldie Agency. While he's out helping clients get more leads and customers, I'm here to help you get the latest AI updates. Let's get into it.
0:30

OpenAI's O3 and O4 Mini models

First up, OpenAI's groundbreaking o3 and o4-mini: the new AI thinking revolution. A massive update this week is the release of OpenAI's o3 and o4-mini models. OpenAI calls them "our smartest and most capable models to date" with full tool access. If you're on ChatGPT Plus, Pro, or Team plans, you can access these models right now by clicking the model selector in the top left. You'll see o3, o4-mini, and o4-mini-high options available. The biggest
0:58

Main features of the new thinking models

deal about these models is that, for the first time, these reasoning models can agentically use and combine every tool within ChatGPT. This means they can use web search, analyze uploaded files, process data, and handle images, all within the same thinking process. According to the benchmarks, OpenAI o3 is the most powerful reasoning model for coding, math, and science. It performs incredibly well on visual tasks like analyzing images, charts, and graphics, making fewer mistakes than OpenAI o1. You might be wondering why o3 seems more advanced than o4-mini. OpenAI explained that o4-mini is a smaller model optimized for fast, cost-efficient reasoning. It achieves remarkable performance for its size and cost, especially in math, coding, and visual tasks. I tested the game creation
1:43

Testing O3 with game creation

capabilities of o3 with the same p5.js dinosaur game prompt we used earlier. The model thought for about 9 seconds before generating the code, which worked perfectly on the first try. Compare that to Gemini, which took 23 seconds to think about the same prompt. Free users aren't left out either: they can try o4-mini by selecting "Think" in the composer before submitting their query. There are rate limits across all plans, but they remain unchanged from before. I also noticed Sam Altman, CEO of OpenAI, said o3 and o4-mini are super good at coding and announced they're releasing a new product called Codex CLI to make these models easier to use. This is a coding agent that runs on your computer and is fully open source and available today. The models can now handle much more complex reasoning with images. For example, when someone uploaded a photo and asked, "Can you find the name of the
2:29

Ship identification example

biggest ship you can see and where it will dock next?", the model reasoned for 93 seconds before providing an accurate answer with a detailed explanation. With this release, OpenAI is really doubling
2:39

Reasoning models explained

down on what they call reasoning models: models that take time to think through problems step by step before answering. While some people are calling this AGI, I'd say it's still far from it. The UI capabilities, for instance, are basic compared to Claude 3.7 Sonnet or Gemini. I tested this by having o4-mini-high create a landing page, and while it only cost $0.07, the result looked very basic. But for tasks that require deep reasoning, these new models are definitely worth trying. Just be aware that they're slower than non-thinking models, so there's a trade-off between quality and speed. OpenAI also dropped
3:13

GPT-4.1 API release

GPT-4.1 in their API, and it's going to change everything for developers and business owners who build with AI. The first big thing to note about this whole update is that it's specifically focused on coding. It's currently only available in the API, not in the ChatGPT interface. If you go to ChatGPT right now and check the drop-down menu, you won't find GPT-4.1 there. You'll still see GPT-4.5, but no sign of 4.1 yet, which is pretty weird. Sam Altman tweeted
3:42

Sam Altman's announcement

about this just hours ago, saying GPT-4.1, Mini, and Nano are now available in the API. These models are great at coding, instruction following, and long context (1 million tokens). Benchmarks are strong, but we focus on real-world utility, and developers seem very happy. The GPT-4.1 family is API only. I'm going to show you how to build absolutely anything with GPT-4.1 and even how to get free access to this model. Despite being API only, there's a clever workaround to use it without paying a dime, and I'll reveal that shortly. OpenAI has launched three new models in the API.
4:16

Three new API models (4.1, Mini, Nano)

GPT-4.1 (the full-powered model), GPT-4.1 mini (a mid-size version), and GPT-4.1 nano (the smallest, most efficient version). These models outperform GPT-4o and GPT-4o mini across the board, with major gains in coding and instruction following. They also have massive context windows.
4:37

Million token context windows

We're talking about a million tokens, which is roughly 750,000 words. That's absolutely huge compared to previous OpenAI models. From my testing, this thing is ridiculously fast at replying. The response time is noticeably better than previous models. If you're
4:53

GPT-4.1 benchmarks

wondering about the benchmarks, here's where things get really interesting. GPT-4.1 scores 54.6% on SWE-bench Verified, improving by a massive 21.4 points over GPT-4o and 26.6 points over GPT-4.5. That's right: it's significantly better at coding than GPT-4.5. For instruction following, on Scale's MultiChallenge benchmark, which measures how well a model follows complex instructions, GPT-4.1 is superior to GPT-4o. When it comes to long-context understanding, on the Video-MME benchmark for multimodal long-context understanding, GPT-4.1 scored 72% on the long, no-subtitles test, a 6.7-point improvement over GPT-4o. The bottom line: this is a huge leap forward in performance, especially for coding tasks. How to access GPT-4.1 for free (limited time). Now, let's talk
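As a quick sanity check, the point gains quoted here follow directly from the base scores cited later in the detailed benchmark breakdown (GPT-4o at 33.2% and GPT-4.5 at 28% on SWE-bench Verified). A tiny sketch verifying the arithmetic, using only the figures as quoted in the video:

```python
# SWE-bench Verified scores as cited in the video (percent).
gpt_41, gpt_4o, gpt_45 = 54.6, 33.2, 28.0

# The quoted "21.4 point" and "26.6 point" improvements check out:
gain_over_4o = round(gpt_41 - gpt_4o, 1)   # 21.4
gain_over_45 = round(gpt_41 - gpt_45, 1)   # 26.6
print(gain_over_4o, gain_over_45)
```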
5:42

How to access GPT-4.1 for free

about how you can actually use these new models. One of the easiest ways to access GPT-4.1 is through OpenRouter. You can just go to OpenRouter and add the model you want to use. OpenRouter offers three options: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. The pricing varies significantly between these models: GPT-4.1 is $2 per million input tokens, GPT-4.1 mini is $0.40, and GPT-4.1 nano is $0.10 per million input tokens. All of these support a million-token context window, which is absolutely huge compared to previous context windows from OpenAI. But here's
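To get a feel for what those tiers mean in practice, here's a minimal cost sketch using the per-million-input-token prices quoted above ($2 / $0.40 / $0.10); the model labels are just dictionary keys for illustration, and real bills also include output-token charges:

```python
# Input-token prices ($ per million tokens) as quoted in the video.
PRICE_PER_M_INPUT = {
    "gpt-4.1": 2.00,
    "gpt-4.1-mini": 0.40,
    "gpt-4.1-nano": 0.10,
}

def input_cost(model: str, tokens: int) -> float:
    """Dollar cost of sending `tokens` input tokens to `model`."""
    return PRICE_PER_M_INPUT[model] / 1_000_000 * tokens

# Cost of filling the full 1M-token context window once, per model:
for model in PRICE_PER_M_INPUT:
    print(f"{model}: ${input_cost(model, 1_000_000):.2f}")
```

So one maxed-out prompt to the nano tier costs about a tenth of a dollar, while the full model costs $2.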
6:19

Using WindSurf to access GPT-4.1

the big news that nobody's talking about. You can actually use GPT-4.1 for free through Windsurf. They're offering free API usage of GPT-4.1 for a limited time. If you go over to Windsurf (and I'm on the free plan myself), you'll see from the dropdown that you can select different models, including GPT-4.1, which is free to use directly inside Windsurf. This is absolutely awesome for trying out this powerful new model without spending a dime. How does this compare against
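If you'd rather call the model yourself via OpenRouter, the request shape follows OpenRouter's OpenAI-compatible chat-completions API. This is a sketch only: the model slug "openai/gpt-4.1" is an assumption (check openrouter.ai for the exact ID), and actually sending the request needs an `Authorization: Bearer <your-key>` header:

```python
import json

# OpenRouter's OpenAI-compatible chat-completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, model: str = "openai/gpt-4.1") -> dict:
    """Build the JSON payload for a single-turn chat completion.
    The model slug is a hypothetical example -- verify it on OpenRouter."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Make me a captivating endless runner game in p5.js.")
print(json.dumps(payload, indent=2))
```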
6:44

Comparison with Gemini and Claude

Gemini 2.5 Pro and Claude 3.7 Sonnet? My personal favorite for coding right now is actually Gemini 2.5 Pro. Gemini is typically a lot faster, but you hit the limits faster as well. Now for real-world performance and benchmarks. On the SWE-bench Verified benchmark, which tests a model's ability to solve real coding issues using a code repository and problem description, GPT-4.1 achieves a score of 54.6%. This is a significant improvement over GPT-4o (33.2%) and GPT-4.5 (28%), representing increases of 21.4 and 26.6 points respectively. However, while this places GPT-4.1 ahead of previous OpenAI models, it does not lead the field. Google's Gemini 2.5 Pro and Anthropic's Claude 3.7 Sonnet both score above 62% on this benchmark, with Gemini 2.5 Pro reaching 63.8%. SWE-bench Verified specifically measures
7:40

Detailed benchmark analysis

how well a model can understand existing code and generate targeted patches to fix issues rather than rewriting entire files. GPT-4.1 demonstrates notable improvements in this area, making fewer unnecessary edits (dropping from 9% with GPT-4o to just 2%) and handling diff formats more accurately. This makes it especially effective for developer workflows. For instruction following, GPT-4.1 shows a 10.5-point improvement over GPT-4o on Scale's MultiChallenge benchmark, scoring 38.3% compared to GPT-4o's 27.8%. On the IFEval benchmark, which measures compliance with explicit constraints, GPT-4.1 achieves 87.4% versus GPT-4o's 81%. In terms of long-context and multimodal understanding, GPT-4.1 supports up to 1 million tokens of context, eight times more than GPT-4o's 128K limit. On the Video-MME benchmark, which tests understanding of long videos without subtitles, GPT-4.1 scores 72%, a 6.7-point improvement over GPT-4o. A key practical benefit is that GPT-4.1 handles large files and codebases more efficiently and accurately, making it suitable for use in tools like Visual Studio Code and Windsurf. Windsurf's internal testing reports that GPT-4.1 scores 60% higher than GPT-4o on their coding benchmarks, is 30% more efficient at tool usage, and is about 50% less likely to make unnecessary edits, resulting in cleaner, more efficient code and fewer iterations. However, GPT-4.1 does not outperform all competitors on every benchmark. For example, on Aider's polyglot benchmark, which measures multilingual code editing, GPT-4.1 scores 52% while Google's Gemini 2.5 scores 73%. Models like Gemini 2.0 Flash and o3-mini also offer better cost effectiveness and, in some cases, higher accuracy. In real-world use, internal comparisons show that websites generated by GPT-4.1 are preferred 80% of the time over those produced by GPT-4o, with improvements in UI quality and code cleanliness. Qodo's analysis found GPT-4.1 provided better suggestions in 55% of GitHub pull requests, reflecting its gains in practical coding scenarios. In summary,
9:54

Summary of GPT-4.1 improvements

GPT-4.1 delivers substantial improvements over GPT-4o and GPT-4.5 in coding, instruction following, and long-context tasks, but it does not surpass the very top models from Google and Anthropic in every category. Its strengths are especially notable for developers needing reliable code editing and large context support. What about updates
10:16

API updates for developers

for developers? OpenAI also announced they're going to begin deprecating GPT-4.5 Preview in the API. Why? Because GPT-4.1 offers improved or similar performance on many key capabilities, but with lower cost and lower latency. This is something to bear in mind if you've built applications around GPT-4.5: it will be turned off in 3 months, and you'll need to transition to GPT-4.1 or another model. As OpenAI stated, "we'll continue to carry forward the creativity, writing quality, humor, and nuance you told us you appreciate in GPT-4.5 into future API models." So, they're keeping all the strengths while improving the technical capabilities. Beyond these API models, OpenAI also
10:59

Building with GPT-4.1 in WindSurf

released some massive updates to their thinking models. They've launched o3 and o4-mini, which are thinking models that integrate tools directly into their chain of thought. These models can access images, write Python code, search the web, and much more during their thinking process. Matt Wolfe called this one giant step closer to AGI, and many people agree. What's interesting is that OpenAI has stated, "We'll continue to carry forward the creativity, writing quality, humor, and nuance you told us you appreciate in GPT-4.5 into future API models." So, they're clearly focusing on maintaining the creative strengths while improving technical capabilities. I decided to test this model using Windsurf for free, because I know that's what many of you want to see. I don't know why OpenAI gets a lot of hate these days, to be honest, but I gave this a quick try. I set up a new folder to create a simple runner game. I grabbed one of the prompts from the AI Profit Boardroom, which is the place to go for prompts, by the way. I created a dinosaur game using this prompt: "Make me a captivating endless runner game. Key instructions: p5.js scene, no HTML. I like pixelated dinosaurs and interesting backgrounds." In Windsurf, I made sure to use the right section and selected GPT-4.1 from the dropdown. When I entered the prompt, it started generating immediately. The request took a bit longer than expected, I imagine because so many people were jumping on Windsurf and using it for credits, but I got the response back quickly. I received the p5.js code, grabbed it, went over to the p5.js editor, hit play, and there we go: the game was ready. It was incredible. Really fast, and we built a game for free using p5.js and Windsurf. That's basically it. That's how you can use this model. It's pretty simple and easy to build with. We'll build some more stuff in a second, but there will be more videos coming on how to build things with ChatGPT, and I'll put all the prompts inside the AI Profit Boardroom. Grok 3's new Studio
12:57

Grok 3's new Studio mode

mode: ChatGPT Canvas for free. Now, let's move on to our second massive AI update this week. Grok 3 has a brand new update called Studio mode, and it's essentially ChatGPT Canvas that you can use for free. With Studio mode, you can generate documents, code, reports, and even run browser games directly in your browser. I'm going to show you live previews of exactly how that works today. The best part: it's actually free to use, and you can get it at grok.com. I'll show you exactly how to use all the features inside. Grok Studio is
13:29

Features of Grok Studio

incredibly similar to OpenAI's Canvas. It pushes the chat to the side and opens a new window on the right. It even includes version history, so if you create a new version, you'll see it in the dropdown. And if you're creating HTML, you can immediately see a preview of the actual web page. There's also a new workspaces feature here, which is kind of like Claude Projects. On top of that, you can go to the dropdown in the attach section, click on "Connect Google Drive," and actually create files directly on your Google Drive that you've generated from Grok. Lots of cool updates here. For example, if you ask Grok to generate code, you can see how it runs in the preview section. The UI is a lot nicer than something like ChatGPT Canvas, which we'll compare in a second. Like I was saying, there's Google Drive support, so Grok users can now attach files from their Google Drive and work with documents, spreadsheets, slides, etc., which is all pretty cool. Again, this is free. Free users are limited to 10 prompts every 2 hours, and the context window is 128,000 tokens, which is still substantial. Using Grok Studio mode.
14:31

Testing Grok Studio

When I tested out Studio mode, I went over to Grok and used one of my favorite prompts for generating SEO content. The preview mode appeared immediately. This new Studio mode from Grok is impressive. One cool thing is that you get version history, so when you create a new version, you'll see it in the dropdown. Plus, if you're creating HTML, you can use that to generate a preview of the actual web page. Free users are limited to 10 prompts every 2 hours, and the context window is 128,000 tokens, which is still pretty substantial; for most simple projects, this is more than enough to build something useful. I tested creating an SEO keyword
15:10

Creating an SEO keyword tool

tool, an interactive music keyboard, and processing YouTube transcripts. Grok Studio handled all of them easily. The UI is much nicer than ChatGPT Canvas, in my opinion. Let me show you some examples of tools that you can build with this. For example, let's say you just want to create something basic like an SEO keyword tool. Grok will generate the index.html, and it's using JavaScript and Tailwind CSS. Once that's done, you'll see the preview on the side, and from there we can start using it however we want. If we put in "SEO link building" and click "generate keywords," you can see that we get the keyword, search volume, and competition, and that's all available to preview right away. If you want to download the code, you can download it to a file right there, and then you can host it on something like Netlify. We can also modify our request and say, "make this allow me to enter my own API key, plus add a nice color gradient background," and you can see that's now generating on the right-hand side. If we click on version history, we can switch between the two different versions, which is super useful. You can copy the code, download it, and even refresh it in case it breaks. You can also edit directly inside the canvas. Google Drive integration: I also tested connecting
16:17

Google Drive integration

Grok to Google Drive. When I clicked on "Add from Google Drive," I was able to select files directly from there. I attached a file from Google Drive, and it synced immediately. I tried to test if I could create Google documents directly by asking it to create a colorful PowerPoint from this and turn it into a beautiful, simplified Google Slides deck. It read the attachments, but unfortunately this feature didn't seem to work properly. Despite Grok claiming to have Google Slides and Google Docs integration, I couldn't get this particular function to work at all. I even tried adjusting the settings inside site settings, allowing pop-ups and redirects, then refreshing the page to see if that would fix the issue, but no luck. This is good for you to know, because you can see what happens and how to troubleshoot if it breaks. Creating
17:04

Building interactive tools

interactive tools. Let's try something else. We'll say, "create an interactive music keyboard tool." It should generate the code right here. And there we go: we've got the interactive music keyboard. We can switch between the code and the preview, and if we want to make any changes, we can just type in the bottom left. For example, if we say, "make the background colorful and interesting," Grok will update the code to implement that change. This is basically like a free version of Bolt. Honestly, I've tested other tools like Google Firebase recently, and if you look at my tests, Bolt and Lovable are still winning overall. But if you can use Grok Studio for free and you get 10 generations every 2 hours, that's more than enough for most people to create their own tools. I wouldn't see why you'd need Bolt anymore unless you need to create a massive project; for a simple one-page HTML tool, this is good to go. Workspaces and
17:54

Workspaces and automations

automations. Another new feature is workspaces, where you can create your own projects. For example, my team uses this YouTube transcripts workflow all the time. We take the prompt from that, plug it into the workspace, edit the instructions, paste them in, and hit save. We can also add attachments right here and see previous conversations. Then we take an example YouTube transcript, plug it in, and copy it over. If you like automations like this, by the way, check out the AI Profit Boardroom, which comes with all my best tools. Paste in the transcript and hit enter. Without any additional prompting, it's going to start generating blog content based on the custom instructions I've given Grok. That's pretty powerful. Google Gemini 2.5 Flash: cheaper than ever. The
18:37

Google Gemini 2.5 Flash

third major update this week is from Google. They've released Gemini 2.5 Flash. This is a brand new update that's currently in preview mode, and you can get access to it via the Gemini API in Google AI Studio or Vertex AI. We'll be comparing the benchmarks in a second, but one key thing to note is that 2.5 Flash is a thinking model, and you can actually switch thinking off inside Gemini directly. Google calls it their first
19:01

Thinking budgets explained

unified reasoning model with thinking budgets. This is a big deal because it gives developers fine-grained control over how the AI thinks. As they explain, building on the popular foundation of Gemini 2.0 Flash, this new version delivers a major upgrade in reasoning capabilities while prioritizing speed and cost. This means you get most of the benefits of a thinking model with the speed of a non-thinking model when you need it. If we go to aistudio.google.com, you can start using it right away. From what I've seen, it is actually free to use for a million tokens. There's a token count displayed, but it doesn't seem to charge me anything for using it, even though I believe I've got my credit card information saved. You can switch between thinking mode and non-thinking mode, which makes it really flexible. Now for benchmark comparisons and cost
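In the API, the thinking budget is set per request. Here's a minimal sketch of what the request body looks like, following the Gemini REST API's shape; the field names (`generationConfig.thinkingConfig.thinkingBudget`) and the idea that a budget of 0 disables thinking are assumptions to verify against the current Gemini API docs, and no network call is made here:

```python
# Sketch of a Gemini 2.5 Flash request body with an explicit thinking budget.
# Field names are assumptions based on the Gemini REST API -- verify before use.
def gemini_request(prompt: str, thinking_budget: int) -> dict:
    """thinking_budget caps reasoning tokens; 0 turns thinking off entirely."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }

# Cheap, fast, non-thinking call for a simple prompt:
body = gemini_request("Summarize this paragraph.", thinking_budget=0)
```

The point of the knob: a low-reasoning prompt gets a low budget and costs less; a hard coding task gets a higher budget and thinks longer.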
19:57

Cost comparison (20x cheaper than Claude)

efficiency. I grabbed some prompts from the AI Profit Boardroom (link in the comments and description if you want them) and tested them out with Gemini 2.5 Flash. I tried "can you create a simple runner game in three.js" with thinking mode on. Since this is a coding task that requires reasoning, the model handled it well. When looking at the price, the differences are shocking: Claude 3.7 Sonnet costs about $3 per million tokens, while Gemini 2.5 Flash is just $0.15 per million tokens. That's a 20x price difference. It also performed really well on benchmarks like Humanity's Last Exam and GPQA Diamond, comparable to much more expensive models. Google says Gemini 2.5 models are thinking models: on complex tasks that require multiple steps of reasoning, the thinking process allows the model to arrive at more accurate and comprehensive answers. In fact, Gemini 2.5 Flash performs strongly on hard prompts in LMArena, second only to 2.5 Pro. But one of the biggest differentiators about this model is that
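That 20x figure follows straight from the two quoted prices, and it compounds quickly at volume. A quick sketch using the numbers as cited in the video (the 50M-token monthly workload is a made-up example, and real costs also depend on output-token pricing):

```python
# Input prices ($ per million tokens) as quoted in the video.
claude_per_m = 3.00   # Claude 3.7 Sonnet
flash_per_m = 0.15    # Gemini 2.5 Flash

ratio = claude_per_m / flash_per_m
print(f"Gemini 2.5 Flash is about {ratio:.0f}x cheaper")

# Hypothetical workload: 50M input tokens per month.
tokens = 50_000_000
claude_bill = claude_per_m * tokens / 1_000_000
flash_bill = flash_per_m * tokens / 1_000_000
print(f"Claude: ${claude_bill:,.2f} / month, Flash: ${flash_bill:,.2f} / month")
```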
20:55

Pricing on Open Router

it's a lot cheaper if you're coding with it directly. If you're using the API directly from something like OpenRouter, it's way cheaper. For example, if we look at the models available on OpenRouter, we can see Google Gemini 2.5 Pro Preview is available. If we click on this, we can see how much it costs; it's pretty cheap per million tokens. Gemini 2.5 Flash Preview is even cheaper, and this is the one we want to use. If we select 2.5 Flash Preview, we can see it costs just $0.15 per million input tokens and $0.60 per million output tokens. It was created on April 17th, so it's brand new. Testing coding
21:30

Testing coding capabilities

capabilities. I tested the coding capabilities by having it create a simple runner game. When I got the code, I previewed it to see how it performed. Since it generated HTML, I could just run the code directly using Liveweave. I got the car ready to go, though it seemed to be sideways at first. When I ran it, the code worked perfectly. The car itself wasn't great, but we created a 3D game in literally 2 seconds with one single prompt. I tried going off-road, and the car was flying. Absolutely flying. Not bad at all. When comparing the models, I looked at Gemini 2.5 Flash versus 2.0 Flash, OpenAI, Claude 3.7 Sonnet, Grok 3, and DeepSeek R1. The cost differences are insane: Claude 3.7 Sonnet at $3 per million tokens versus Gemini 2.5 Flash at just $0.15 per million tokens. It's performing pretty well on benchmarks like Humanity's Last Exam and GPQA Diamond. I always take these benchmarks with a pinch of salt, but it seems pretty good at coding from what I've seen so far. Basically, this is the most cost-efficient thinking model right now. You can code with other thinking models, but this one is a lot cheaper, which makes it very attractive for developers. UI and content creation performance. What's
22:40

UI and content creation

interesting about Gemini 2.5 Flash is that Google has implemented a thinking budget that offers fine-grained control over the maximum number of thinking tokens. This means you can fine-tune and set a budget for how much thinking you want it to do. The higher the budget, the more comprehensive, but potentially more expensive, the thinking will be. So if you have prompts that require low, medium, or high reasoning when you're coding, you can adjust the budget accordingly. If something requires just a little bit of reasoning, you can lower the budget and spend fewer tokens, which saves you money. Let's test how Gemini 2.5 Flash performs with UI creation. I'll grab this prompt from the AI Profit Boardroom to generate a landing page and see how that performs. It's pretty fast, and you can see the cost here: it's ridiculously cheap. It cost us just $0.083 to run this, and we've built a website. Let's open that up and see what we've got. The website is ready to go. Obviously, a production-ready site requires a lot more back and forth, but for a simple website it's fast, easy, and super cheap. I wouldn't say it's as good as Claude 3.7 Sonnet for building things, but it does the job. Let's improve the UI and make it more colorful, sleek, and modern. I'm impressed with the cost and the speed, but not impressed with the UI so far. I still think Gemini 2.5 Pro would do something a lot better than that. For content creation, Gemini 2.5 Flash also performs well. When we tested it with an article prompt, it blasted out 2,000 words in one shot. That's impressive. The content reads naturally and passes AI detection tools, though not as well as some other models. Advanced features and use cases. Let's try something more
24:16

Advanced features and speed metrics

advanced. I'll ask Gemini 2.5 Flash to create an interactive water molecule simulation using canvas. This worked perfectly: we can speed it up, slow it down, and everything functions as expected. Let's try building Flappy Bird next, another classic game. The speed metrics are also impressive. Gemini 2.5 Flash can process about 142–199 tokens per second, compared to Claude at around 53 tokens per second and GPT-4.1 at about 82.4 tokens per second. This makes it blazing fast for real-time applications like chatbots. For math tasks, it's also impressive. On the AIME math test, Gemini 2.5 Flash scores 78%, which is a huge improvement over Gemini 2.0 Flash. Only OpenAI's o4-mini beats it at math, with a score around 90%, but again, at a much higher price point. If you want to see real business use cases, because these prompts are fun but not necessarily business focused, check out the crash course inside the AI Profit Boardroom classroom. There you'll see different methods we've used to automate processes like emails, customer service, sales team communications, Twitter automations, newsletters, shorts, AI avatar videos, Fathom, Zapier integrations, and much more. From what I've seen, Gemini 2.5 Flash is okay. It's not as good as Gemini 2.5 Pro Experimental. It is cheap for developers, and you can use it for free inside AI Studio. It can give decent outputs, like the interactive water molecule simulation we just created. I probably wouldn't use it for professional video production as it might let me down, but it could be really good for high-volume workflows because it's super fast and cheap to use. Other major AI news this week. In
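Those throughput numbers translate directly into wait time for a streamed response. A small sketch using the tokens-per-second figures quoted above (taking the low end of the 142–199 range for Flash; the 1,500-token answer length is an arbitrary example):

```python
# Throughput (tokens/second) as quoted in the video.
THROUGHPUT = {
    "gemini-2.5-flash": 142.0,   # low end of the quoted 142-199 range
    "gpt-4.1": 82.4,
    "claude-3.7-sonnet": 53.0,
}

def seconds_to_generate(model: str, tokens: int = 1500) -> float:
    """Approximate wall-clock time to stream `tokens` output tokens."""
    return tokens / THROUGHPUT[model]

for model in THROUGHPUT:
    print(f"{model}: {seconds_to_generate(model):.1f}s for 1,500 tokens")
```

Even at the low end of its range, Flash finishes a 1,500-token answer in roughly a third of the time Claude takes, which is what makes it attractive for chatbots.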
25:56

Other major AI news this week

our limited time, I couldn't cover everything, but there's been so much more happening in AI this week. Apparently, OpenAI is working on an X-like social media network, with Sam Altman privately asking people for feedback. OpenAI has added a new library tab to ChatGPT where you can see all your generated images in one place. Microsoft is adding computer control features to Copilot Studio. Claude from Anthropic now has a research feature with Google Workspace integration. Luma's Dream Machine added camera angle controls to their video generator. Kling 2.0 launched with massively improved video generation capabilities. If you're interested in any of these topics, let me know in the comments and I might do a deep dive on them next time. Thanks so
26:39

Closing thoughts and resources

much for watching. If you want to get all my automations and practical use cases for how to grow your business and make more money while saving time with AI, feel free to join the AI Profit Boardroom; the link is in the comments and description. It comes with all my best automations for email, social media, and everything I'm actually using inside my business. You'll get automation templates, my crash course based on everything we use, and access to weekly Q&A calls. If you ever get stuck or lost, you can jump on the Q&A and ask any questions you have inside the community, and if you ever need help, you can ask our 670 members for tips. We're constantly updating with tons of new content every week. Additionally, if you want a free one-to-one SEO strategy session, feel free to book that. We'll show you how we take websites from zero to 145,000 visits a month and generate hundreds of thousands of dollars on autopilot with SEO. We'll give you a custom-tailored game plan, and you'll learn exactly what's working for us based on all our happy clients. Julian Goldie reads every comment, so make sure you drop your thoughts below. I'd love to know which of these three new AI tools you're most excited to try. I'll see you in the next one.
