I just got back from Google I/O and want to share my thoughts. Google is a sponsor and I really want them to succeed. But I want to be honest with you all about what I think is working and what isn’t with Google’s strategy.
TIMESTAMPS
(0:00) Google's biggest problem and the 3 races it needs to win
(1:00) Race 1: Evolving chat into a personal agent
(2:30) My honest take on Spark vs OpenClaw, Codex, and Claude Code
(4:30) Race 2: Coding and knowledge work
(5:30) Gemini 3.5 Flash performance and pricing
(7:00) Race 3: From text to multimodal
(11:00) My favorite exec at Google or any AI company
(12:30) 3 takeaways on what Google needs to do next
📌 Subscribe for more extremely practical AI tutorials and interviews:
https://www.youtube.com/@PeterYangYT?sub_confirmation=1
CONNECT WITH ME
Newsletter: https://creatoreconomy.so/
X: https://x.com/petergyang
LinkedIn: https://www.linkedin.com/in/petergyang/
Table of contents (8 segments)
Google's biggest problem and the 3 races it needs to win
Hey everyone, I just got back from Google I/O, and I want to share my thoughts on Google's AI strategy. Now, Google is a sponsor of my content, and I really want them to succeed, but I want to be honest with you all about what I think is working and what isn't. Here are all the AI products that they launched at I/O. Now, this looks like a flex, but I think it's actually a problem. Google is launching so many AI products that users don't really know where to start anymore. I think this tweet from Nathan says it all. Google has Gemini, AI Studio, Antigravity, Spark, Flow, Stitch, Pomelo, and more. It's getting really confusing for consumers and enterprises to know which product to use for what. So, overall, I think Google needs to focus its products on winning three AI races that really matter right now: the race to evolve chat into a personal agent, the race to build a super app for coding and knowledge work, and the race to expand beyond text to multimodal, with images, videos, and more.
Race 1: Evolving chat into a personal agent
So, let me walk through each one. First, the race to evolve chat into a personal agent. You know, I think the AI chat era is coming to an end. People don't want AI that just replies in chat, they want an AI that can actually get work done for them. That's why I'm convinced that AI personal agents are going to be a massive market, probably $1 trillion or more. Not everyone wants to write code, but everyone really wants a personal agent or chief of staff that can do work for them. So, here's my mental model of the personal agent landscape, and I've tried them all. You know, on the one end, you have OpenClaw and Hermes. These agents live in your messaging apps, they're fully customizable, and OpenClaw especially pioneered this whole category. Right now, I use Hermes daily for emails, calendar, weekly reports, and more. And in the middle, you have Codex and Claude Code. These products are backed by companies, OpenAI and Anthropic, and they're rapidly adding personal agent features, like the ability to add any API, to run cron jobs, and more. But, using them, they still feel like coding tools first and personal agents second. And then over here, you have Google and Gemini on the right. And the fact is that Google already has all my personal context. My emails are in Gmail, my calendar is in Google Calendar, my documents are in Google Docs, and all of this stuff lives in Google Drive, right? But for the longest time, the Gemini app
My honest take on Spark vs OpenClaw, Codex, and Claude Code
couldn't do basic things like edit a Google document, which was just really frustrating to me. That's why I'm really excited about Google announcing Spark, which is their version of a personal agent. So, Google's vision for Spark is to build a personal, proactive, and powerful agent. And let's kind of break down what each one means. Personal means understanding you through Gmail, Calendar, Workspace, and Drive. Google already has a massive head start here. Proactive means telling you what matters before you ask. There's a new daily brief feature that they are shipping that surfaces, for example, what matters across all of Google's apps, which I think could be really awesome. And finally, powerful means being able to use Google's apps, but also any third-party API or MCP to actually get work done. And I particularly love the fact that Spark is in the cloud on a virtual machine, so you don't have to keep your laptop lid open to use it. Now, I had a great chat with Chris at I/O, and Chris is the head of product for the Gemini app. And I asked him point-blank, "When can we expect to hook up Spark to any API or MCP that we want?" Which I can already do with OpenClaw, Codex, Claude Code, and Hermes. And his answer was that for any kind of write action, meaning an action where you're asking the agent to update some information, the agent should probably ask the user for approval first. And look, I totally get it. Gemini has 900 million users, and you don't want someone to accidentally delete all their files, right? But I think this is being a little bit too safe. I think Gemini should let users decide how much control to give up to their agents, whether it's asking the user for permission each time, or just bypassing all permissions like I do in Codex and Claude Code. It's just really annoying to sit there and hit yes, yes each time your agent asks you for permission. So, with these other apps, I always just hit bypass permissions, and I feel like
Race 2: Coding and knowledge work
the model is smart enough that it's not just going to randomly delete all my files or do something that is completely broken. So, I think, you know, Google should trust its users, and the bottom line here is that Google needs to move fast because it has all the context, it has the advantage in that, but I think the agent is not as powerful as some of its competitors, and it cannot afford to lose the personal agent race. Okay, so now let's talk about coding and knowledge work. I think Google is playing catch-up here on the AI front, even though it has, you know, quite a monopoly on knowledge work in terms of Google Workspace. Now, what's been happening is that the AI native builders that I know have all largely switched over to Codex because of the generous rate limits, the great app, and GPT-5.5, which is an awesome coding model. And meanwhile, enterprises have largely switched over to Claude Code because Anthropic has done an incredible job riding the Claude Code hype cycle and
Gemini 3.5 Flash performance and pricing
just driving enterprise adoption. So, where does that leave Google? Let's talk about the company's new coding model and harness. Okay, so first, Gemini 3.5 Flash is a new model, and based on these benchmarks, it looks like it's pretty good, right? It's outperforming Gemini 3.1 Pro on a lot of things. In some cases, it actually even outperforms GPT-5.5. But, the catch is, of course, that the pricing has also gone up. Although, Gemini 3.5 Flash is still a lot cheaper than the other frontier models. You can see here that Gemini 3.5 Flash has a $1.50 input price and a $9 output price compared to, for example, GPT, which has a $5 input price and a $30 output price. And Claude Opus is the most expensive. You know, this actually matters a lot because enterprises are running out of budget with these expensive frontier models. They're looking for good-enough, cheaper models to do the majority of their agentic work instead. Okay, so now let's talk about the harnesses. I tried the new Antigravity app, and it looks pretty slick. But, it also looks very similar to the Codex and Claude Code harnesses, where you have this left nav to talk to your agents. So, let's just take a look. This is Antigravity. This is the Claude Code app. And here we have the Codex app. And you can see how similar all three apps look, right? They all have the chat threads on the left side, and you can talk to the agent here, and then maybe
Race 3: From text to multimodal
there's a browser preview or something else here. But, what I kind of wished to see was actually some sort of innovation in the Antigravity harness. For example, I think this UI works great if you are just a single person talking to your agents. But, if you are a team or an organization that is trying to collaborate with agents and humans at the same time, I think this UI starts falling apart. Now, the other thing I will say is that Google just has too many harnesses. You know, for example, I really don't understand why Stitch, Google's AI design tool, is a completely separate product from Antigravity. When I build a product, you know, I want to plan it, design it, and then build it all using one tool. I shouldn't have to switch between three different Google apps to do that. Now, at a time when OpenAI and Anthropic are building super apps where one tool can handle coding, design, and knowledge work, Google should work hard to make Antigravity that super app, in my opinion. I think this is actually pretty controversial because Google is also adding AI chat to Docs, Slides, Sheets, Gmail, Calendar, and all the other knowledge products that it owns, right? And each of these products, let's face it, has a billion users or hundreds of millions of users. But, I think the future is we're all just going to interact with one agent and super app that really understands us to get both coding and knowledge work done. Maybe we'll go into these other apps and services to manually tweak stuff at the very end. But, the single agent will do most of the job. So, this is actually an existential threat for Google, in my opinion. They really need to build a super app or super agent, whatever you call it, faster than the competitors so that they still own the customer relationship. And I think Antigravity has a lot to live up to. So, I'm hoping it becomes incredible very fast.
Okay, so I've been a bit critical about Google's AI coding efforts, so let me end where I think Google is genuinely ahead, which is multimodal AI. When we talk to each other as humans, we don't just send text messages to each other, right? We switch between text, sometimes we do voice replies or calls, sometimes we want to record a video with our camera and communicate that way. And unless they really screw something up, I feel like Google is going to win consumer AI, because it's the only US lab that's actively building competitive video models, and consumers love video. After all, both TikTok and YouTube are far more popular than any text-based platform. And the only real competition I think Google has in video right now is xAI, which also has some video models, and Chinese video models like Seedance that don't seem to really respect copyright laws. So, I think Google has a massive chance to win here. And I'm also really excited for the new Omni model that they announced, which lets you take any input to generate any type of output. So, for example, I can use voice to provide input, and it can come out as video, or images, or anything else, right? But even here, I think Google has too many products. For example, Flow, which I'm showing here, is, in my opinion, Google's best product to generate images and video. And you can make some pretty amazing scenes with it. But you probably haven't heard of Flow yourself, right? So, why isn't this stuff just part of the default Gemini app experience? Why is there a separate app? And another pet peeve of mine is that the number one use case, in my opinion, for editing images and videos, at least for consumers, is changing family photos and videos. But there are just a lot of safety and privacy restrictions in place. So, whenever I want to edit or upload a video of my kids, Gemini doesn't let me actually do it. I totally get why it's not allowed, but just as a parent, it's my number one use case.
So, I hope Google can figure out when a parent wants to edit their kids' videos versus some stranger.
My favorite exec at Google or any AI company
Okay, so I can't end this video without glazing this man. This is Josh Woodward, who is the VP in charge of Gemini, Labs, NotebookLM, and a bunch of other Google products. And he's probably my favorite exec at Google and possibly any other company. Google historically has been known for a very bureaucratic culture, a lot of planning, not a lot of shipping. And I think in certain departments, that's still probably the case. But Josh has really changed the culture around it. Like everything this man says, I just fundamentally believe in to my balls, right? So, we had a session with Josh at I/O and he said stuff like "just try a lot of stuff and build to learn," which is like the Labs ethos. Someone asked him a question about how much planning he does. And he said, "We only have a 90-day road map and maybe if we're lucky, it's 120 days." So, there isn't a ton of planning theater. And he also said, "I don't know if we will ever go back to 1-year road maps. I haven't been working on a 1-year road map for 5 years." You know? And that's the culture that I think you really need to win in the AI space because there's so much change happening. You have to focus on velocity over planning, prototypes over documents and decks. Chris, the Gemini VP who I also talked to, who reports to Josh, told me that his team caps PRDs at one page and runs meetings using AI Studio and Antigravity prototypes instead of presenting docs. That's what you really need to win in this space. So, I'm really encouraged by
3 takeaways on what Google needs to do next
the culture that Google is building internally at Gemini. All right. Well, let me close with three takeaways. Again, there's a whole bunch of other things that AI can transform, like health and other things that are really important. But just in terms of pure competition, I think there are really three AI races right now. There's a race to convert chat into a personal agent. And Google has the data, the products, the model, and now they have Spark. So, this is their race to lose. But what they need to do, I think, is to trust users to unlock third-party APIs and MCPs, not gate everything behind a bunch of permission prompts, and also maybe let users personalize their agent. Maybe I don't want to call my personal agent Spark. Maybe I want to call it Zoe or some other name, right? Let me make my agent truly mine. The second race is the expansion from coding to all knowledge work. And I genuinely think that Google is behind here. But Antigravity is the right bet. They need to consolidate all the stuff around it, build a super app that has a really good, cheap, and fast model in Gemini Flash, and extend it into knowledge work. So, don't just add chat windows to every Google product like Slides and Docs and stuff like that. You can do that, but that's kind of like the foundational stuff. The more important effort is to make Antigravity the best way for agents to actually build apps, but also do all kinds of knowledge work. And last but not least, the transition from text to multimodal. And this is one area where I think Google is far ahead. Their video models are incredibly good. They own YouTube. They have a great image model in Nano Banana. They just have to, again, consolidate the tool set so that I can just use the regular Gemini app to make all this amazing content. And perhaps relax the privacy restrictions so that I can make stuff of my family and friends. And just focus, right? Don't just build a whole bunch of separate apps to do this stuff.
Focus on making Gemini and Antigravity the best damn apps that they can be. And I'm really rooting for Google here, not only because they're a sponsor, but because, you know, they have amazing talent. I love Josh and his team. The data's there, the infrastructure is there. They have the full stack. And the most important piece that they're building right now is the culture, right? The culture is also, I think, improving. But I think they just need to focus. Focus and ship. So, that's my review of I/O. And if you want more honest reviews and deep dives like this, please like and subscribe, and I'll see you next time.