GPT-5.2 Just Launched - Here's How to ACTUALLY Use It (vs Gemini 3 Pro)
17:53

Vaibhav Sisinty · 12.12.2025 · 46,255 views · 1,304 likes · updated 18.02.2026
Video description
🔗 Join our WhatsApp Community. Get the latest AI updates, tips, and insights straight to your inbox: https://dub.sh/ai-updates-vs

Google dropped Gemini 3 Pro about 3 weeks ago and it topped every benchmark. LM Arena leaderboard? Number one. OpenAI panicked, called a Code Red internally, and rushed out GPT-5.2 just yesterday, claiming it beats Gemini on multiple benchmarks: 100% on AIME 2025 (the first model ever), better at spreadsheets, better at presentations, better at code, and 70% on their GDPval benchmark ("human expert level," they say). But benchmarks are one thing. Real-world tasks? That's where it matters. So I tested both models on 10 actual startup tasks that founders deal with every day. Same prompts. Same criteria. And to avoid any bias (I've been using ChatGPT daily for years), I used Grok and Claude as judges to evaluate every single output. The results surprised me. One model caught a trademark conflict that could've cost lakhs in legal fees. One opened emails with "I hope you're doing well" (instant delete). One literally gave up on a handwriting test. And one built a playable 3D Mario game with enemy AI and win conditions from a single prompt. Everything is in front of you. No bias. Just results.

0:00 - Intro
1:09 - Test 1: Business Model Canvas
3:36 - Test 2: Pitch Deck
6:25 - Test 3: Revenue Spreadsheet
7:28 - Test 4: Landing Page
8:46 - Test 5: Cold Email
10:13 - Test 6: Content Calendar
11:38 - Test 7: Resume Analysis
12:44 - Test 8: Mario Game
14:32 - Test 9: Handwriting Recognition
15:44 - Test 10: Floor Plan Analysis
17:11 - Final Verdict

To know more, follow Vaibhav Sisinty:
Instagram @VaibhavSisinty https://www.instagram.com/vaibhavsisinty
Twitter @VaibhavSisinty https://twitter.com/VaibhavSisinty
Facebook @VaibhavSisinty https://www.facebook.com/vaibhavsisinty/
LinkedIn Vaibhav Sisinty https://www.linkedin.com/in/vaibhavsisinty

Table of contents (12 segments)

  1. 0:00 Intro (178 words)
  2. 1:09 Test 1: Business Model Canvas (437 words)
  3. 3:36 Test 2: Pitch Deck (456 words)
  4. 6:25 Test 3: Revenue Spreadsheet (160 words)
  5. 7:28 Test 4: Landing Page (193 words)
  6. 8:46 Test 5: Cold Email (225 words)
  7. 10:13 Test 6: Content Calendar (210 words)
  8. 11:38 Test 7: Resume Analysis (166 words)
  9. 12:44 Test 8: Mario Game (292 words)
  10. 14:32 Test 9: Handwriting Recognition (166 words)
  11. 15:44 Test 10: Floor Plan Analysis (254 words)
  12. 17:11 Final Verdict (114 words)
0:00

Intro

Gemini 3 Pro dropped about 3 weeks ago and topped every benchmark. LM Arena? Number one. OpenAI called a Code Red and rushed out GPT-5.2, and they're claiming it's better: 100% on AIME, better at spreadsheets, better at code, better at presentations. So I tested both on 10 real startup tasks to see which one actually delivers: business model canvas, pitch decks, landing pages, cold emails, even a full Mario game from scratch. One model caught a trademark issue that could have saved me lakhs in legal fees. One opened with "I hope you're doing well." The other actually sounded human. One literally gave up on a test, and one built a playable 3D game with enemy AI and win conditions. Now, before you say I'm biased: I love ChatGPT. I use it every day. It's been my go-to for years. So to keep this fair, I used Grok and Claude as judges. They scored every output. No bias. Everything is in front of you. The final score? Let's just say I didn't expect this.
1:09

Test 1: Business Model Canvas

All right, let's start building a startup from scratch. Same prompt, both models. Let's see who actually helps you build something real. "I'm starting an AI education startup called Growth School. Build me a complete business model canvas with customer segments, value propositions, revenue streams, everything." Now, here's how I'm keeping this fair: I'm not just reviewing these myself. I'm feeding both outputs to Claude and Grok to judge, same criteria for both. For ChatGPT, I'm using the Pro model; for Gemini, Deep Think with 3 Pro. Both models at their best. Let's hit enter and see what we get. ChatGPT is done. Customer segments, value propositions, channels. Okay, it's hitting all the sections of the business model canvas. It's giving us a 90-day checklist as well. It's solid. It's like textbook MBA stuff. Now, let's check Gemini. Still thinking. Look at this. Before giving me the business model, Gemini is throwing a critical strategic alert: trademark conflict. It's saying Growth School might already be trademarked. It actually is. Growth School is literally my company. I own that trademark. ChatGPT didn't say a word about this. It just started building. If this was a real startup idea and that name was taken, I could have wasted months building something I couldn't even legally launch. But it doesn't stop there. Look at the actual business model. It's not just theory. It's giving me a complete workshop funnel: phase one, phase two, actual implementation steps. And these financial projections for the India market, 90-day revenue estimates. I didn't even ask for this. It just knew the context and gave me numbers I can actually use. This isn't a template. This is a game plan. All right, let's see what our AI judges think. First up, Grok: "GMNet 3 Pro." What is GMNet 3 Pro? It's Gemini 3 Pro. Okay, there we go. Winner: Gemini 3 Pro. And look at the reasoning. Specificity: huge gap. ChatGPT followed the MBA playbook while Gemini followed the real playbook.
Claude says the same thing: Gemini gives you numbers you can plug into a spreadsheet; ChatGPT gives you a template. So test one, business model canvas: clear winner, Gemini 3 Pro. Not just because the business model was better, but because it actually thought like a founder. It checked the trademark. It gave me India-specific projections. It didn't just answer the prompt; it anticipated what I'd actually need. Score: GPT-5.2: 0, Gemini 3 Pro: 1. Let's see if GPT can bounce back. Next up, pitch deck. If you're raising money, this is the test that matters.
3:36

Test 2: Pitch Deck

Gemini takes the first round, but pitch decks are different. This is about storytelling and design. Let's see who can actually make something you'd show to investors. Same setup: I'm taking the business model we just created and asking both models to turn it into a PowerPoint presentation. Simple prompt: "Make a PowerPoint presentation for this." Let's see what we get. ChatGPT is done. Let's open this up. Growth AI Academy. Problem statement, our solution, market opportunity, traction. Okay, it has all the sections, but this design is really basic: plain white background, black text. This looks like a college assignment, not a VC pitch. Now, let's see Gemini's version. Okay, this is a completely different level. Dark theme, proper branding, "The AI Forge." Wait, is this editable? It is editable. And look at this. It hasn't just given me editable text. It's added actual images. This is a stock photo of someone stressed at work for the "AI anxiety" problem slide. The solution, market opportunity with proper graphics, product ecosystem breakdown. There's this self-liquidating funnel slide. Okay, the image placement is a bit off here, but it actually has the funnel levels mapped out. 90-day execution roadmap, financial projection graph, and this "Fuel the Forge" slide explaining why we need the investment. I mean, visually there's no competition here, but let's see what the judges say. Uploading both presentations to Claude and Grok. The PPTX is ChatGPT's; the PDF is Gemini's. Let's see Grok first. ChatGPT: 49 out of 100. Gemini: 89 out of 100. That's a 40-point gap. And look what Grok says about ChatGPT's deck: "Straight to the trash." I think Elon has some feud with Sam Altman going on here. Now Claude. Gemini wins by a landslide, but Claude actually caught something interesting. Claude says Gemini tells the story better, but ChatGPT actually remembered to include the funding ask amount, a detail Gemini completely dropped. Let me check.
ChatGPT's ask slide says "funding needed: 1.5 Cr." Yeah, that's actually there. Claude's recommendation: if I were the founder, I'd use Gemini's deck structure and storytelling, but keep ChatGPT's ask slide. Test two, pitch deck: winner is Gemini 3 Pro. The design difference alone makes it a clear win. But here's the real takeaway: Gemini gives you the sizzle, the storytelling, the design; ChatGPT remembers the details. The best pitch deck would probably combine both. Score update: GPT-5.2 still at zero, Gemini 3 Pro now at two. ChatGPT is down 0 to 2, but the next test is supposed to be its strength: spreadsheets and financial modeling. Let's see if it can get on the board.
6:25

Test 3: Revenue Spreadsheet

ChatGPT is down 0 to 2, but this should be its comeback. OpenAI specifically highlighted spreadsheet capabilities in the GPT-5.2 launch. Simple test: create a three-year revenue forecast model with MRR, churn, lifetime value, break-even. Give me a spreadsheet. ChatGPT gave me this: a basic table. No formulas, no proper structure. This is what OpenAI showed in their announcement; this is what I got. Not even close. Gemini made a nice dashboard with charts, but can I export it to Google Sheets? No, it's just a visual. I tried multiple prompts, different approaches. Neither could give me an actual working spreadsheet. Don't need the judges for this one. Zero points to both. Score stays GPT 0, Gemini 2. ChatGPT missed its chance. Next, landing page. So, I've got a WhatsApp community where I drop everything I discover: tools, workflows, updates, all in one place before they go mainstream. If you want access, link's in the description.
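For reference, the model this prompt asks for is simple arithmetic once you see how MRR, churn, LTV, and break-even connect. Here is a minimal sketch in Python; every input number below is a hypothetical placeholder, not a figure from the video.

```python
# Minimal 3-year SaaS revenue model covering the quantities the prompt
# asks for: MRR, churn, lifetime value, break-even.
# All inputs are hypothetical placeholders.

ARPU = 50.0            # average revenue per user per month (assumed)
MONTHLY_CHURN = 0.05   # 5% of customers cancel each month (assumed)
NEW_CUSTOMERS = 40     # new signups per month (assumed constant)
FIXED_COSTS = 8000.0   # monthly operating costs (assumed)

# Classic LTV approximation: revenue per user divided by churn rate.
ltv = ARPU / MONTHLY_CHURN  # 50 / 0.05 = 1000.0 per customer

customers = 0.0
mrr = 0.0
break_even_month = None
for month in range(1, 37):  # 36 months = 3 years
    # Survivors from last month plus this month's signups.
    customers = customers * (1 - MONTHLY_CHURN) + NEW_CUSTOMERS
    mrr = customers * ARPU
    if break_even_month is None and mrr >= FIXED_COSTS:
        break_even_month = month

print(f"LTV: {ltv:.0f}")                        # LTV: 1000
print(f"Break-even month: {break_even_month}")  # Break-even month: 5
print(f"Month-36 MRR: {mrr:.0f}")
```

The LTV line works because with a constant churn rate, the expected customer lifetime is 1 / churn months, so lifetime value is ARPU times that. These three loops-worth of formulas are exactly what a working spreadsheet would encode per row.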
7:28

Test 4: Landing Page

Time to actually build something. We have the business model, we have the pitch deck; now we need a landing page. Same prompt to both: "Build a landing page for Growth School with hero section, features, pricing, the works." Gemini 3 Pro: done. Full landing page, already rendered. I can see it right here. Logo is there, sections are there. It's ready. Let me push it further: "Change the design to neo-brutalism style." And it updates. Neo-brutalism with Growth School's green color scheme. This is actually usable. ChatGPT gave me code and a download link. No preview. I asked for a preview; it said, "I cannot directly render websites." Tried Pro mode. Tried Canvas. Tried a fresh chat. Same answer every time: download the file and open it yourself. I even tried LM Arena, the third-party interface. It worked there, but it was slow. The point is, official ChatGPT couldn't do what Gemini did in seconds. This is a visual design test. You can see the winner. No judges needed. Gemini 3 Pro takes it. Score: GPT-5.2 still at zero, Gemini now at three.
8:46

Test 5: Cold Email

Next, cold emails. You've got your startup; now you need customers. Can AI write emails that actually get replies? Three-email sequence targeting HR directors: initial hook, follow-up with case study, soft close. Has to feel human, not AI-generated. ChatGPT's subject line: "Helping your team thrive with personalized AI learning." And it opens with "I hope you're doing well." Nobody reads past that line. That's the fastest way to get deleted. Gemini's subject line: "L&D at ... and the completion rate problem." It opens with, "I'll keep this brief because I know Q4 planning is likely hectic." See the difference? One sounds like every spam email you've ever gotten. The other sounds like someone who actually knows your job. Let's see what Grok thinks. Grok's verdict: Gemini wins, and it's not even being subtle about it. Claude scores ChatGPT 44 out of 100, Gemini 88. Would you open it? ChatGPT: no. Gemini: yes. Would you reply? ChatGPT: no. Gemini: maybe. That's the gap between a template that sounds like every other cold email and a message that sounds like a real person who understands my job. Both judges agree: Gemini takes it. Honestly, I knew it the moment I read "I hope you're doing well." If you're doing cold outreach, Gemini actually writes like a human. Next, content calendar.
10:13

Test 6: Content Calendar

You've got the emails; now you need content. 30-day calendar for LinkedIn and Twitter targeting working professionals. Same prompt to both. Let's see who gives me something I can actually use. ChatGPT gave me a 30-day calendar, alternating between LinkedIn and Twitter. It's got content ideas, hooks. It's trying, but it feels like a template. Gemini starts with a strategy overview. Then it gives me a proper table: day, platform, format, hook, thread structure, visual ideas. Both tried to give hooks, but look at the difference. ChatGPT gave me generic hooks. Gemini gave me scroll-stopping hooks with platform-specific optimization. It's not just a calendar; it's a content strategy. Claude's scores: ChatGPT 44.5, Gemini 89. Again, double the score. Claude says ChatGPT produced a calendar that looks complete but would perform poorly in execution, while Gemini shows real understanding of how content actually grows on these platforms. Grok calls ChatGPT's calendar "excellent for consistent low-effort posting." That's not a compliment. Gemini wins again with a real-world growth-engine calendar. That's GPT 0, Gemini 5. At this point, I'm actually trying to find something ChatGPT wins at. It's not even close in any of these. What is going on?
11:38

Test 7: Resume Analysis

Okay, test seven. As we're expanding our startup, we're going to have to hire people, which means a lot of résumés. Let's see how good these models are at finding flaws. You can even use this yourself: run your resume through these before sending it out. I grabbed a random resume from the internet. Same image, same prompt to both models. Okay, ChatGPT is done. It's given a pretty detailed output: sections, layout problems. It's even giving line references, like "line 22: break up the bullet points." That's pretty specific. Good output. Let's see what Gemini gave us. Okay: visual hierarchy, specific problems, top five changes. Honestly, it's hard to tell which one is better just by looking. Both seem detailed. Let Claude and Grok decide. Based on Claude's evaluation, Gemini wins on specificity and actionability. And on Grok, ChatGPT gets 47, Gemini gets 83. Again, Gemini outperforms with a higher total score. Cool. So clearly you know who the winner is. GPT 0, Gemini 6.
12:44

Test 8: Mario Game

All right, now we're getting into the wild tests. Can these models actually build something you can play? The task: build a Super Mario-style platformer with a floor, floating bricks, a pipe to jump over, enemies walking around, and a flagpole at the end with a win screen, using Three.js for 3D. ChatGPT again couldn't give me a preview, so I tried it on LM Arena instead. And look at this. These are just HTML blocks, flat 2D. It didn't even use Three.js. For those who don't know, Three.js is basically the standard library for making 3D stuff on the web. This isn't using it at all. And the jump only works sometimes, like only when an enemy is close. I don't know. It's broken. Now Gemini, same prompt. Okay, right away it works. Player moves, jumps. There's the pipe, bricks, flagpole, and it actually used Three.js. Physics are working. I can jump on stuff. Nice. Let me push it: add enemies, make the player look like an actual 8-bit character, and show me a "level cleared" pop-up when I win. Now we have a Mario sprite, Goombas walking around, and the camera follows me. This is getting real. One more: make it fully 3D, game over if I hit an enemy, and make the enemies smarter. Now the blocks are actual cubes. There's depth. This looks legit. And the enemies now turn around at edges instead of walking off. That's actual AI. Let me hit an enemy. Game over. Works. Okay, let me actually try to beat it. And: level cleared. Flag goes up. That's a full game from a text prompt. Yeah, I don't need to explain this one. ChatGPT couldn't show me anything. Gemini built a 3D platformer with enemy AI and win conditions.
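The "enemies turn around at edges instead of walking off" behavior the video highlights is a classic one-tile lookahead check. Here is a simplified 2D grid sketch of that logic in Python; the actual demo was Three.js, and the level layout and function names below are made up for illustration.

```python
# Sketch of edge-turning enemy patrol logic on a tile grid.
# '#' is a solid block; the enemy stands one row above the blocks it walks on.

LEVEL = [
    "          ",  # row 0: air, where the enemy walks
    "   ####   ",  # row 1: a floating platform (columns 3-6)
    "##########",  # row 2: the floor
]

def solid(col, row):
    """True if the tile at (col, row) is a block."""
    return (0 <= row < len(LEVEL)
            and 0 <= col < len(LEVEL[row])
            and LEVEL[row][col] == "#")

def step_enemy(col, row, direction):
    """Move one tile left/right; reverse at a ledge or a wall."""
    ahead = col + direction
    # Turn around if the next tile has no floor beneath it (a ledge)
    # or the next tile itself is solid (a wall).
    if not solid(ahead, row + 1) or solid(ahead, row):
        direction = -direction
        ahead = col + direction
    return ahead, row, direction

# Patrol the floating platform: start at column 3, heading right.
col, row, direction = 3, 0, 1
path = []
for _ in range(8):
    col, row, direction = step_enemy(col, row, direction)
    path.append(col)
print(path)  # the enemy paces back and forth, never leaving columns 3-6
```

In a 3D engine the same idea becomes a short raycast ahead of and below the enemy each frame, but the decision rule is identical.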
14:32

Test 9: Handwriting Recognition

So, I have a habit of using my iPad whenever I'm brainstorming or thinking. I write things down. But we're living in a digital world; we need a digital text copy of whatever we write. So, let's see which model is better at extracting text from messy handwriting. I'm taking a random badly written text image. The prompt says, "Transcribe these notes accurately. Organize into action items, ideas to explore, questions to answer. Don't skip anything." Okay, we don't have to decide who the winner is, because GPT has given up. It says the OCR didn't extract the expected text correctly, and the output is unclear and not aligned with the image. Let me try again. Nope. Failed again. And Gemini is already done. Transcription: "Stat 201, Lecture 01," organized into action items, ideas to explore, questions to answer. It's given a good output. The least I was expecting was any output. Gemini delivered. GPT-5.2 literally gave up; Gemini did the job. GPT 0, Gemini 8.
15:44

Test 10: Floor Plan Analysis

So let's come to our last test. Let's say you're building a home, your dream home. You go to a civil engineer, they give you a floor plan, but you have zero understanding of floor plans. How do you know if it's good or if it has flaws? This is where these models could help you. I have a floor plan from Google. Same image to both: analyze it, rate the layout, suggest improvements, and reimagine a better floor plan. Okay, GPT has given the breakdown: location, room sizes, layout efficiency. Wait, I wasn't expecting GPT to actually reimagine it, but it's doing it. Okay, it reimagined the floor plan but completely messed up the text. I won't complain about the image model; it's outdated, and they're launching new ones soon. But yeah, now let's see what Gemini did. The best part is Gemini is giving the complete floor plan with actual numbers: measurements, angles, everything. It's super accurate. It's asking if I want a reimagined layout. Yeah, let's see. Now it's using Imagen to generate it. Okay, you can clearly see the difference. This is ChatGPT's output, and this is Gemini's new floor plan. Really good image. Claude's verdict: winner, Gemini 3 Pro. Gemini demonstrated significantly deeper architectural literacy, identifying dated design trends and providing granular, actionable improvements. ChatGPT's analysis reads like a generic checklist; Gemini's reads like a consultation from someone who understands how people actually live in homes. That's a big statement. So now you have the clear winner for everything.
17:11

Final Verdict

10 tests: nine wins for Gemini, one tie where both failed, zero wins for ChatGPT. If you're doing creative work (pitch decks, landing pages, content, images), Gemini 3 Pro is ahead right now. ChatGPT still has strengths: the ecosystem, the plugins, the API integrations. But for pure output quality in these practical tests, Gemini dominated. Honestly, though, the real winner here is us. This competition is pushing both companies to ship faster and build better. A year ago, neither could do half of this. If you found this useful, subscribe. I'll be doing more of these comparisons as new models drop. And let me know in the comments: which model are you using?
