Everyone Is Wrong About Sora 2 vs Veo 3.1 (I Tested Everything)

Vaibhav Sisinty | 19.10.2025 | 23:20 | 59,741 views | 1,584 likes | updated 18.02.2026

Video description

🔗 Join Our AI Updates WhatsApp Channel: https://whatsapp.com/channel/0029VbBALDP84OmAI59i4b0R
🔗 Subscribe to our Newsletter! Get the latest AI updates, tips, and insights straight to your inbox: https://dub.link/staying-ahead

In this video, we dive deep into two revolutionary AI video models: Google Veo 3.1 and OpenAI Sora 2. Both of these platforms were released within two weeks of each other, and they are set to completely change the way we create content. We explore their capabilities, features, and official prompting guides, and end with a head-to-head showdown to help you decide which one is the best for your needs. If you're passionate about AI-generated video content and want to stay ahead of the curve, this is a must-watch!

0:00 - Intro
0:52 - Google Veo 3.1 Features
1:14 - Add Ingredients Feature
1:38 - Extend Your Scene
2:05 - First and Last Frame
2:53 - How to Access Veo 3.1
4:47 - Veo 3.1 Testing Demo
7:22 - Add Ingredients Live Test
9:02 - OpenAI Sora 2 Features
10:15 - Sora 2 Physics Test
11:04 - Cameo Feature Demo
12:03 - Dialogue & Lip Sync Test
12:42 - Official Prompting Guides
12:56 - Sora 2 Prompting (5 Steps)
14:15 - Veo 3.1 Prompting (5 Parts)
15:27 - Head-to-Head Showdown
16:00 - Test 1: Cinematic Portrait
17:24 - Test 2: Product Showcase
18:26 - Test 3: Action Sequence
19:35 - Test 4: Atmospheric Scene
21:04 - Test 5: Multi-Character
22:23 - Final Verdict
22:35 - Closing

To Know More, Follow Vaibhav Sisinty On ⤵︎
Instagram @VaibhavSisinty https://www.instagram.com/vaibhavsisinty
Twitter @VaibhavSisinty https://twitter.com/VaibhavSisinty
Facebook @VaibhavSisinty https://www.facebook.com/vaibhavsisinty/
LinkedIn - Vaibhav Sisinty https://www.linkedin.com/in/vaibhavsisinty

Table of contents (23 segments)

0:00 Intro 148 words
0:52 Google Veo 3.1 Features 59 words
1:14 Add Ingredients Feature 78 words
1:38 Extend Your Scene 72 words
2:05 First and Last Frame 142 words
2:53 How to Access Veo 3.1 331 words
4:47 Veo 3.1 Testing Demo 446 words
7:22 Add Ingredients Live Test 215 words
9:02 OpenAI Sora 2 Features 224 words
10:15 Sora 2 Physics Test 110 words
11:04 Cameo Feature Demo 136 words
12:03 Dialogue & Lip Sync Test 102 words
12:42 Official Prompting Guides 45 words
12:56 Sora 2 Prompting (5 Steps) 208 words
14:15 Veo 3.1 Prompting (5 Parts) 169 words
15:27 Head-to-Head Showdown 116 words
16:00 Test 1: Cinematic Portrait 241 words
17:24 Test 2: Product Showcase 126 words
18:26 Test 3: Action Sequence 167 words
19:35 Test 4: Atmospheric Scene 240 words
21:04 Test 5: Multi-Character 250 words
22:23 Final Verdict 40 words
22:35 Closing 145 words

Intro

Google and OpenAI just dropped their new AI video models within like two weeks of each other, and I've been absolutely nerding out testing both of them for the past few days. And honestly, I'm kind of freaking out because one of these is legitimately going to change how you make content. So, here's how this video is going to go. First, chapter one, everything you need to know about Google Veo 3.1. Then, chapter two, a complete deep dive on OpenAI Sora 2. After that, chapter three, and this is the important part, the official prompting guides that both Google and OpenAI just released. Nobody's talking about these yet, and they're absolute game changers. And finally, chapter four, the final showdown, where we test them head-to-head and figure out which one actually wins. And by the end, you'll know exactly which one you should actually be using.

Google Veo 3.1 Features

Look at this output. "The city always got a story. Just got to listen." This output is generated by Veo 3.1: videos with native audio, not just some generic background music. And I started testing it and honestly the features are kind of impressive. So let me show you what I mean. First thing I tried, add

Add Ingredients Feature

ingredients. Basically, you can upload reference images like a scene you want, a specific character, maybe an outfit, and it just combines them with audio and somehow the character stays consistent. I literally threw three completely different images at it and it worked. We're actually going to test that live in a minute so you'll see exactly what I'm talking about. But okay, so you've got your video now, right? Well, here's where things get interesting. So now

Extend Your Scene

you've got your styled video. But here's the problem I always run into. What if it's too short? That's where extend your scene comes in. You generate an 8-second clip, it's perfect, but you need more. Veo just extends it. Same visual style with character and audio consistency. No weird jump cuts. Like it's literally just a seamless continuation. But wait, because this next one, this is where my mind actually broke.

First and Last Frame

First and last frame. So instead of writing out all this motion description, you just give it two images, a starting frame and an ending frame. That's it. And Veo creates the entire transition between them. Smooth, cinematic, all the in-between motion figured out for you. I tested this with a Transformers-style transition, like an object morphing into something completely different, and it legitimately blew my mind. And one more thing, Google officially released the prompting guide for Veo 3.1. So, you're not just getting these insane features, you're actually going to learn how to use them properly in this video. Quick thing though, I've got a weekly newsletter where I break down all the latest AI updates, the tools worth trying, and the ones you should skip. It's completely free. Links in the description below. Okay, so three ways

How to Access Veo 3.1

to access Veo. Let me show you the easiest one first. Open gemini.google.com. You'll need a paid subscription. And yeah, I know, another subscription, but hear me out. Free tier gets you 100 credits per month. That's about five videos. Google One AI Premium is $20 a month and gets you more credits. For serious use, you're looking at API access. See this? There's a video option in the prompt bar. Now you can go from photo or from text. I'm doing text because I want to control everything. Type your prompt, hit generate, and we wait. Boom. 8-second clip. It's watermarked with something called SynthID. Google's trying to be responsible about AI, which, you know, good for them.

Now, if you want the pro setup, and this is what I've been using, there's Flow. This is Google's actual filmmaking platform. Open labs.google/flow. And here we are. So when you first open Flow, you'll see the prompt window. And this is where all the magic happens. See these three tabs at the top: text to video, frame to video, ingredients to video. Text to video is your standard prompt-based generation. Frame to video is where you upload a starting and ending frame. That's the first and last frame feature I mentioned. Ingredients to video is where you upload multiple reference images: character, scene, outfit, whatever.

Right below that, you can choose your model. See this dropdown? That's where you can choose Veo 3.1. Fast is cheaper and quicker. Quality takes longer but the output is noticeably better. I usually test with Fast, then generate finals with Quality. Over here on the right: aspect ratio, portrait or landscape. Pretty straightforward. And this is cool: number of outputs per prompt. You can generate two or three variations at once. I usually do two because it gives me options without wasting credits. So, that's the complete Flow interface. Way more control than Gemini. All right

Veo 3.1 Testing Demo

let's actually make some videos with Veo. Let's start prompting. So, here's my first test, a product shot. Full prompt is on screen if you want to pause and read it. But basically, wireless headphones, studio lighting, 360 rotation. We will use Veo 3.1 Fast. Hit generate and we wait. Okay, it's done. Let me show you. Look at that. The lighting is clean. The rotation is smooth. The audio actually fits.

Now let's try something cinematic. Cinematic wide shot of a determined entrepreneur. Again, complete prompt is on screen. Okay, this is where Veo really starts to shine. Look at that golden hour lighting. It's not just an orange filter slapped on. And the camera movement: see how it's following the subject? It's a smooth tracking shot. It feels like an actual camera operator following someone. The subject's body language reads as confident without being over the top.

Now, let's jump into the scene builder. Let me add this to the scene. Okay, it's in. So, here we are in the scene builder. I'm going to clear out these other scenes first. Don't need them. Now, here's where things get interesting. There was this major issue we kept running into with Veo 3. Whenever you'd use the extend feature to continue a clip, you'd lose character consistency and the lip sync completely fell apart. Let's see if they've actually fixed this. All right, we need to tell it what happens next. Since I want to test the lip sync specifically, I'll write: entrepreneur facing towards camera and giving a motivational speech. Let's see what we get. "Every challenge is a stepping stone. Every setback, a lesson in disguise. Embrace the journey. Trust the process." Wait, this is crazy. Did you catch that? Look at the blend between frames. The character stays perfectly consistent. The transition is seamless. And most importantly, the lip sync and voice-over are spot-on. So this is where Veo starts to shine.

Now let's take this one level further. What if we could control exactly what our character says? Watch this. I want this entrepreneur to deliver a specific message: "To stay ahead in the world of AI, follow me." Let's see what happens. Wait for it. "The process." This is incredible. Not only is the character completely consistent across frames, but she's saying exactly what we scripted, word for word. This changes everything. You know what this means? You're done with that tedious workflow where you screenshot the last frame just to keep your character looking the same. And forget about bouncing between three different platforms for lip syncing and audio. All of that, gone. Now, let's test

Add Ingredients Live Test

that add ingredients feature I mentioned. All right. So in this we're using ingredients to video. I've already added all the images here. Instead of using Veo 3.1 Fast, let's go with Veo 3.1 Quality. And let's bump the output to four variations. Let's generate. Wait, it's saying feature is not supported. Let's try Quality one more time. Hm. Looks like Veo 3.1 Quality doesn't actually support the ingredients to video option yet. So, let's go back to Veo 3.1 Fast and let me refine the prompt a bit before we try again. Wait, it literally switched the scene. She's supposed to be walking in as a customer, but I think they switched her role to barista or something. All right, generating. "Hi, can I get a latte, please?" "Sure. Coming right up." Okay, this time it's different again. Where is she even coming from? But this one actually nailed it. Look at the character consistency. Same person, same outfit. Scene matches the reference. The lighting is natural, her demeanor is confident, and the ambient cafe sounds tie it all together. This is what adding ingredients does when it works. You give it the pieces and it builds the scene while maintaining consistency. All right, so that's Veo 3.1. Now, let's move

OpenAI Sora 2 Features

to Sora 2 and do the same thing. All right, let's actually create with Sora. I'm using the web interface for this. It's faster and easier to demonstrate. And here's the thing, I'm going to use OpenAI's official five-step structure for these prompts. We'll talk about that structure in detail in a minute, but you'll see it in action first. So, this is Sora 2. And if you come here and look at this, this is complete chaos. It's like a brain rot machine mixed with a social media feed. Just swipe up, swipe down, endless content. This is the UI. It's interesting. Now, let's actually create something. So here if you look we get the option to use Sora 2 or Sora 2 Pro. I'll mostly be using Sora 2 Pro because it gives you better output quality and importantly it doesn't slap that watermark on your videos. Then we have orientation, portrait or landscape. Resolution, high or standard. Duration, you can go up to 15 seconds max or drop it down to 10. And here's the storyboard feature which we'll test out in a minute. With storyboard, you can create multiple 15-second scenes and chain them together. Interestingly, they have a 25-second option listed here, but I don't know why it's not active. Can't select it. So, this is the setup. Let's

Sora 2 Physics Test

go with one scene for now. 15 seconds. I have my prompt ready. Hit create. Now, it's in the queue. If you're wondering where the actual processing happens, it's here in drafts. This is where you'll see your video generating. This is where Sora's physics engine should shine. Let's see. Okay, watch the ball trajectory. The arc is realistic. That's a detail most AI tools miss completely. Listen to the audio. That metallic clang when it hits the rim, that's specific to a metal rim. The backboard has that hollow thump. The floor bounces have that echo you get in empty gyms. These aren't generic sound effects. They're contextually accurate.

Cameo Feature Demo

Now, the feature everyone wants to see, cameos. Here, we're out of the storyboard. And here's the thing. In storyboard mode, there's no way to add a cameo, which is kind of limiting. So, we'll go back to the main creation screen. And if you click here, you'll see all the available cameos. These are all the people who've made their cameos public. You've got Jake Paul, Sam Altman, and I actually generated my own cameo. So, let's choose mine. Now, I'll add the prompt. Let's hit create. And now it's generating. This is legitimately impressive. I didn't shoot any of this. I recorded my face for 3 seconds, described a scene, and Sora integrated me naturally. The compositing quality is high. I don't look pasted in or separated from the environment. Last test, dialogue. This

Dialogue & Lip Sync Test

is historically where AI video tools completely fall apart. All right, I've got my prompts ready. Let's see if Sora can handle synchronized speech. "Innovation is our foundation." This looks real. Like genuinely real. If I didn't tell you this was AI generated, you probably couldn't tell the difference. Lip sync is close to accurate. Watch her eyes. There's subtle confidence there. The slight smile, that natural window light creating soft shadows on her face. Is this Hollywood film quality? Not yet. But for AI-generated content with synchronized dialogue, this is the best, and it's only going to get better.

Official Prompting Guides

All right, so if you're getting bad results, it's probably your prompts. Both OpenAI and Google actually dropped official prompting guides. So now I'm going to break down both so you can get amazing results no matter which tool you're using. OpenAI's approach is

Sora 2 Prompting (5 Steps)

basically treating your prompt like you're briefing a cinematographer. Let's build one together as I explain this. First, set your style. What's the vibe? Like, are we going for 1970s film, modern thriller, whatever it is, start with that. So, I'm going to go with 1970s romantic movie style. Next, camera stuff. How's the camera set up? What angle? What's in focus? Medium shot, eye level, shallow depth of field, so the background's all soft and blurry. Now, paint the scene. This is where you get specific about what's actually in the shot. Rainy London street at sunset. Old brick buildings everywhere. Wet cobblestones reflecting those amber street lights. Fog's rolling through. Then lighting and colors. What kind of light are we working with? What colors are we seeing? Warm golden light coming from the left. Soft shadows. Colors: amber, charcoal, gray, deep blue. Action time. What's actually happening? Keep it clear and specific. A man in his 40s, black sweater, checkered pants, enters from the right, takes four steps, pauses, jumps over this small puddle, keeps walking. Last thing, sound. What do we hear? Background sound, distant traffic, rain hitting the pavement, faint jazz music from a bar nearby. So, here's our complete Sora 2 prompt. So
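The structure walked through above can be sketched as a small Python helper. This is only an illustration, not OpenAI's format: the class name `SoraPrompt`, the field names, and the labeled one-line-per-element layout are all assumptions, and the output is plain text you would paste into the prompt box. Note that the walkthrough actually names six elements (style, camera, scene, lighting and colors, action, sound) even though the chapter title says five steps; the sketch keeps all six.

```python
from dataclasses import dataclass


@dataclass
class SoraPrompt:
    """Hypothetical container for the elements named in the walkthrough."""

    style: str
    camera: str
    scene: str
    lighting: str
    action: str
    sound: str

    def render(self) -> str:
        # One labeled line per element, in the order the walkthrough gives them.
        sections = [
            ("Style", self.style),
            ("Camera", self.camera),
            ("Scene", self.scene),
            ("Lighting and color", self.lighting),
            ("Action", self.action),
            ("Sound", self.sound),
        ]
        return "\n".join(f"{label}: {text}" for label, text in sections)


# Values taken from the example built in the video.
sora_prompt = SoraPrompt(
    style="1970s romantic movie style",
    camera="Medium shot, eye level, shallow depth of field, soft blurry background",
    scene=("Rainy London street at sunset, old brick buildings, wet cobblestones "
           "reflecting amber street lights, fog rolling through"),
    lighting=("Warm golden light from the left, soft shadows; "
              "colors: amber, charcoal, gray, deep blue"),
    action=("A man in his 40s, black sweater, checkered pants, enters from the right, "
            "takes four steps, pauses, jumps over a small puddle, keeps walking"),
    sound="Distant traffic, rain hitting the pavement, faint jazz from a nearby bar",
).render()

print(sora_prompt)
```

Keeping each element in its own field makes it easy to change one thing at a time (say, only the camera setup) and regenerate, which is how the later side-by-side tests vary their prompts.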

Veo 3.1 Prompting (5 Parts)

Google's got a different approach. They use this five-part formula. Let me show you how to use it. The formula is cinematography plus subject plus action plus context plus style and ambiance. Sounds fancy, but it's actually pretty simple. Let's build one. Part one, cinematography. How's the camera moving? What's the shot? Starting with a crane shot. Starts low, goes way up high. Part two, subject. What are we actually looking at? Get specific. A paper boat. Like you can see the hand-drawn sail markings on it. Part three, action. What's happening? It's navigating this rain-filled gutter, tilting left and right with the current. Eventually goes into a storm drain heading toward who knows where. Part four, context. Where is this happening? What's the environment? Urban street, afternoon rainstorm, overcast, that soft gray light. Part five, style and ambiance. What's the overall feel? What do we hear? Stop-motion animation style, like it's handcrafted, you know. Audio: rain pattering, water trickling, distant thunder. Here's our complete Veo 3.1 prompt.
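The five-part formula above (cinematography + subject + action + context + style and ambiance) can be sketched the same way. The function name `build_veo_prompt` and the choice to join the parts as sentences are assumptions made for illustration; the video only shows the finished prompt on screen, so the exact wording below is a paraphrase of what is spoken.

```python
def build_veo_prompt(cinematography, subject, action, context, style_and_ambiance):
    """Join the five parts, in formula order, into one sentence-per-part prompt."""
    parts = [cinematography, subject, action, context, style_and_ambiance]
    # Trim whitespace and trailing periods so the join does not double them up.
    cleaned = [part.strip().rstrip(".") for part in parts if part.strip()]
    return ". ".join(cleaned) + "."


# Values paraphrased from the paper-boat example in the video.
veo_prompt = build_veo_prompt(
    cinematography="Crane shot that starts low and rises high",
    subject="A paper boat with hand-drawn sail markings",
    action=("It navigates a rain-filled gutter, tilting left and right with the "
            "current, then slips into a storm drain"),
    context="Urban street during an afternoon rainstorm, overcast, soft gray light",
    style_and_ambiance=("Stop-motion animation style with a handcrafted look; "
                        "audio: rain pattering, water trickling, distant thunder"),
)

print(veo_prompt)
```

The fixed argument order mirrors the formula, so a missing part is obvious at the call site instead of being buried in a long free-form paragraph.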

Head-to-Head Showdown

Now, it's time for the real test. I'm testing both tools with the exact same prompts, five different scenarios. Let's see who wins. For this, we will be using InVideo because it's hands down the fastest way to access Veo 3.1 and Sora 2 Pro right now. So, in InVideo, I'm going to go to agents and models. Here, you've got the option to use both Veo 3.1 and Sora 2 Pro. Let me open Veo 3.1 on the left and Sora 2 Pro on the right. Let's do the side-by-side comparison. I've got the same prompt ready for both. Same settings, 8 seconds. Let's see what happens. First test, cinematic portrait

Test 1: Cinematic Portrait

a CEO in a boardroom with dialogue. Hitting generate on both. Looks like Veo 3.1 is winning here. It's way faster. Sora is pretty slow. Okay, Veo 3.1 is already done. Sora is just at 50%. Let's see what Veo has generated. "Figureation is bring innovation is our foundation." Okay, so like the initial four or 5 seconds, I can't really understand what she's saying. But after that, it's proper English, and Sora 2 Pro is still at 56%. But look at the quality of this video though. It's really good. Like the background, the separation between the background and her, the overall composition, it's beautiful. Now I'm waiting for Sora 2 Pro. I'm very excited to see how this turns out. Okay, looks like Sora 2 Pro is finally done. Let's see. "Innovation is our foundation." Wait, what? Okay, so this is more realistic. Like Veo 3.1 looks realistic, but this is totally cartoonish, very plasticky. Sora 2 Pro clearly lost it here. The only thing where Sora 2 Pro actually gave proper output was the audio. Like in audio, Veo 3.1 was saying something else up till here which I couldn't understand, while Sora 2 Pro is saying exactly what I wanted it to say. But overall, Veo 3.1 clearly wins. One point goes to Veo 3.1. Sora 2 Pro gets zero. Now, let's go to our next test,

Test 2: Product Showcase

product showcase. Luxury perfume bottle with dramatic lighting. I have my prompts ready. Same prompt to both. Hit generate again. Looks like Veo is winning on speed again. Veo 3.1 won. It generated the output way faster than Sora 2 Pro. Sora is still at 14%. Veo's already done. Okay, pretty good output. Okay, finally Sora 2 also generated. So although Sora 2 lost on speed, it clearly won on the quality of the video. Like it's more premium. The light is out of the frame. Here in Veo, the light is in the frame. Sora's lighting is strong and balanced. And this is where Veo 3.1's product consistency really shines. But yeah, in round two, Sora 2 is the winner. Test three, action sequence

Test 3: Action Sequence

skateboarding trick with dynamic camera work. I've got my prompts ready. Same prompt again. Hit generate. Okay, Veo 3.1 won, as usual, on speed. Let's check out the output. Smooth camera tracking. Physics look good. Audio sync is perfect. Yeah, cool. Lighting is good, too. Now, let's wait for Sora 2 Pro. Okay. Wow. So, in Sora, the camera is smooth and the physics are exceptional. Have you seen this? It doesn't even feel like it's 3D. The physics are so accurate and the sound is super weird in a good way. I mean, it's detailed, like very detailed. Now, if you look at Veo 3.1, although it gave us faster output, it's more like a slow-motion shot. And the physics, it's kind of struggling a little bit, but Sora 2 Pro, it's so accurate and exceptional. It did multiple angles, multiple times. It faced the physics challenges and just nailed it. For action sequences, Sora wins. The physics engine is just better. Now, let's test

Test 4: Atmospheric Scene

the atmospheric scene. This is actually my favorite test. Rainy Tokyo alley at night with that Blade Runner vibe. Prompts ready. Let's go again. I'm expecting Sora is going to be super slow and Veo 3.1 fast. Yep. Looks like Veo 3.1 is winning on speed. Okay, Veo 3.1's output is here. Let's see how it is. Not bad. Not bad at all. The camera movement is butter smooth. The audio captured the entire soundscape, the rain, the ambience. This is professional-grade stuff. But at the same time, it doesn't look completely real. It looks kind of like it's straight out of a game or something. And here's something cool about Veo 3.1 though. You give it a start frame and an end frame, and it doesn't just cross-fade. It actually builds the in-between motion with believable physics and camera movement. Like your storyboard literally grows legs. Now let's wait for Sora 2 Pro. Okay. Output is ready. Let's see it. Yeah. So in this one, clearly, when it comes to motion specifically, Veo 3.1 won. Details and lighting, though, probably Sora 2 Pro won, but the thing is it's very static, while if you look at the rain in Veo 3.1, it's much better. The rain in Sora 2 Pro feels like fake rain, you know. So yeah, overall I'm giving this point to Veo 3.1, definitely. Okay, let's move to the next

Test 5: Multi-Character

one, multi-character interaction. Last test: two colleagues having a conversation in a coffee shop. Let me grab my prompts first. All right, let's go. Generate and generate. Cool. Again, when it comes to speed, Veo 3.1 is just running ahead. Okay. Veo 3.1 is here. "And then he realized the report was due yesterday." "Oh my gosh. No, he didn't. That is hilarious." Okay, so to be honest, this is very real. I mean, yeah. Just look at the lighting coming in. The reflection of light on their hair, on his face, the direction is correct. Both are interacting really well. Both have really good lip sync from different angles. So yeah, they're not messing up there. Overall, the frame is consistent. It's good. Yeah, I'll have to give it to Veo 3.1. But Sora 2 Pro might surprise us. Let's wait. So here is Sora's output. "I swear, the look on his face when the printer started spitting out 50 copies." "Who does that?" Oh, okay. So in Veo 3.1, the voice is very isolated. It doesn't have a lot of ambient noise, while Sora 2 Pro, and not just in this output but in every output, adds a lot of ambient noise. But to be honest, I think both stand out here. Both gave amazing quality. Like I'd have to say, yeah, both are consistent. So for social, character-focused content, both are really strong. All right
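The five test results can be tallied in a few lines of Python. The per-test winners come from what is said in each round; recording Test 5 as a tie is an assumption, since the host first leans toward Veo 3.1 and then says both stand out. The variable names are illustrative.

```python
from collections import Counter

# Winner of each head-to-head round as stated in the video.
# Test 5 is recorded as a tie (an interpretation, see the note above).
results = {
    "Test 1: Cinematic Portrait": "Veo 3.1",
    "Test 2: Product Showcase": "Sora 2 Pro",
    "Test 3: Action Sequence": "Sora 2 Pro",
    "Test 4: Atmospheric Scene": "Veo 3.1",
    "Test 5: Multi-Character": "tie",
}

tally = Counter(results.values())

for outcome, count in tally.most_common():
    print(f"{outcome}: {count}")
```

On this reading the rounds split evenly, which fits the verdict that follows: the choice depends on the job rather than on an overall winner.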

Final Verdict

moment of truth. Which one should you actually use? Use Veo 3.1 when you need speed (it's way faster), product videos and commercial content, cinematic, moody scenes, and character consistency across shots. Use Sora 2 when you need accurate

Closing

physics for action sequences, crystal clear dialogue and lip sync, rich layered soundscapes, and that premium polished look. Look, this isn't about picking sides. It's about using the right tool for the right job. Here's the thing. 5 years ago, AI couldn't even generate a realistic image. Today, we've got Veo 3.1 and Sora 2. The barrier to entry for video content just dropped to basically zero. So, here's what I want you to do. Pick one and try it. And look, this is the speed at which the world is changing right now. To stay updated in all this chaos, join our WhatsApp community (links in the description) and subscribe so you don't miss what's dropping next. Drop a comment below. Are you team Veo or team Sora? I'm genuinely curious what you guys think. All right, I'll see you in the next
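The verdict above amounts to a lookup from what you need to which tool to reach for. A minimal sketch, assuming hypothetical short labels for each strength (the labels and the `recommend` function are made up for illustration; the groupings themselves come from the verdict):

```python
# Strengths per tool as listed in the final verdict; the label strings are shorthand.
STRENGTHS = {
    "Veo 3.1": {
        "speed",
        "product videos",
        "cinematic moody scenes",
        "character consistency",
    },
    "Sora 2": {
        "action physics",
        "dialogue and lip sync",
        "layered soundscapes",
        "premium polished look",
    },
}


def recommend(needs):
    """Return the tool whose listed strengths cover more of the needs, else 'either'."""
    wanted = set(needs)
    scores = {tool: len(strengths & wanted) for tool, strengths in STRENGTHS.items()}
    best = max(scores.values())
    winners = [tool for tool, score in scores.items() if score == best]
    # No match at all, or an even split, means the verdict does not favor one tool.
    if best == 0 or len(winners) > 1:
        return "either"
    return winners[0]


print(recommend(["speed", "product videos"]))
print(recommend(["action physics", "dialogue and lip sync"]))
```

A mixed list such as `["speed", "action physics"]` returns `"either"`, which matches the point that this is about the right tool for the job rather than picking sides.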
