# Google’s Video AI: Outrageously Good! 🤖

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=YxmAQiiHOkA
- **Date:** 29.10.2022
- **Duration:** 9:04
- **Views:** 385,007
- **Source:** https://ekstraktznaniy.ru/video/13403

## Description

❤️ Check out Runway and try it for free here: https://runwayml.com/papers/
Use the code TWOMINUTE at checkout to get 10% off!

📝 The paper "High Definition Video Generation with Diffusion Models" is available here:
https://imagen.research.google/video/

📝 My paper "The flow from simulation to reality" is available here for free:
- Free version: https://rdcu.be/cWPfD
- Orig. Nature link - https://www.nature.com/articles/s41567-022-01788-5 

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Dominique Warner, Matthew Allen Fisher, Matthew Valle, Michael Albrecht, Michael Tedder, Nevin Spoljaric, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajars

## Transcript

### Teaser [0:00]

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. I cannot believe that this paper is here. This is unbelievable. So, what is going on here? Yes, that’s right, we know that these modern AI programs can paint images for us,

### Text to image [0:15]

anything we wish, but today, we are going to find out whether they can also do it with video. You see an example here, and here. Are these also made by an AI? Well, I’ll tell you in a moment.

So, video? That sounds impossible. That is so much harder! You see, videos require a much greater

### Text to video? [0:37]

understanding of the world around us, so much more computation, and my favorite, temporal coherence. What is that? This means that a video is not just a set of images, but a series of images that have to relate to each other. If the AI does not do a good job at this, we get this. Flickering.
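To make the idea of flickering a little more concrete, here is a minimal sketch of one crude way to put a number on it: the average per-pixel change between consecutive frames. This is an illustrative proxy only, not a metric from the paper; the function names and the tiny synthetic "videos" are hypothetical. Note that a score like this also rises with legitimate motion, so it only makes sense for comparing clips of the same scene.

```python
from typing import List

# One grayscale frame: rows of pixel intensities in [0, 1] (an assumed format).
Frame = List[List[float]]


def mean_abs_frame_difference(video: List[Frame]) -> float:
    """Average absolute per-pixel change between consecutive frames."""
    total = 0.0
    count = 0
    for prev, curr in zip(video, video[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                total += abs(a - b)
                count += 1
    # A video with fewer than two frames has no frame-to-frame change.
    return total / count if count else 0.0


def constant_frame(value: float, height: int = 2, width: int = 2) -> Frame:
    """Hypothetical helper: a frame where every pixel has the same intensity."""
    return [[value] * width for _ in range(height)]


# A static scene whose brightness drifts smoothly from 0.50 to 0.56.
smooth = [constant_frame(0.50 + 0.02 * t) for t in range(4)]

# The same scene, but brightness jumps between 0.50 and 0.80 on every frame.
flickering = [constant_frame(0.50 if t % 2 == 0 else 0.80) for t in range(4)]

smooth_score = mean_abs_frame_difference(smooth)
flicker_score = mean_abs_frame_difference(flickering)

print(f"smooth:     {smooth_score:.3f}")
print(f"flickering: {flicker_score:.3f}")
```

The flickering clip scores much higher than the smooth one, which is the intuition behind temporal coherence: neighboring frames should change only as much as the scene itself does.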

### It is really here! [1:07]

So, as all of this is so hard, I thought, we will be able to do this maybe in 5-10 years, or maybe never? Well, scientists at Google say not so fast. Now hold on to your papers, and have a look at this. Oh my goodness! Is it really here? I am utterly shocked, but the answer is yes. Yes it is!

So now, let’s have a look at 3 of my favorite examples, and then I’ll tell you how much time this took. By the way, it is an almost unfathomably short time.

### First example [1:45]

Now, one, the concept is the same: one simple text prompt goes in, for instance, a happy elephant wearing a birthday hat walking under the sea, and this comes out. Wow. Look at that! That is exactly what we were asking for in the prompt, plus, as I am a light transport researcher by trade, I am also looking at the waves and the sky through the sea, which is absolutely beautiful, but it doesn’t stop there - I also see every light transport researcher’s dream there. Water caustics. Look at these gorgeous patterns. Now, not even this technique is perfect, you see that temporal coherence is still subject to improvement, the video still flickers a tiny bit, the tusk is also changing over time. However, this is incredible progress in so little time. Absolutely amazing.

Two, in good Two Minute Papers fashion, now let’s ask for a bit of physics, a bunch of

### Second example [2:48]

autumn leaves falling on a calm lake forming the text “Imagen Video”. I love it. You see, in computer graphics, creating a simulation like this would take quite a bit of 3D modeling knowledge, then, we also have to fire up a fluid simulation.

Now, this does not seem to do a great deal of two-way coupling: the water has an effect on the leaves, you see it advecting this leaf here, but the leaves do not seem to have a huge effect on the water itself. This is possible with specialized computer graphics algorithms like this one, and I bet it will also be possible with Imagen Video 2. Now, I am super happy to see the reflections of the leaves appearing on the water. Good job, little AI! And to think that this is just the first iteration of Imagen Video,

### Simulation or reality? [3:48]

wow. By the way, if you wish to see how detailed a real physics simulation can be, make sure to check out my Nature Physics comment paper in the video description. Spoiler alert: the surprising answer is that they can be almost as detailed as real life.

I was also very happy with this splash. And with this turquoise liquid’s movement in the glass too. Great simulations on version 1. I am so happy!

Now, three, give me a teddy bear doing the dishes. Whoa! Is this real?

### Third example [4:20]

Yes it is! It really feels like we are living inside a science fiction movie. Now, it’s not perfect, you see that it is a little confused by the interaction of these objects, but if someone told me a few weeks ago that an AI would be able to do this, I wouldn’t have believed a word of it. It not only has a really good understanding of reality, but it can also combine two previous concepts, a teddy bear and washing the dishes, into something new. My goodness. I love it.

Now, while we look at some more beautiful results, we noted that this is incredible progress in so little time. But, how little exactly? Well,

### How long did this take? [5:08]

if you have been holding on to your papers so far, now, squeeze that paper, because OpenAI’s DALL-E 2 text to image AI appeared in April 2022, then, Google’s Imagen, also text to image, appeared one month later, May 2022, that is incredible, and get this, now, only 5 months later, by October 2022, we get this. An amazing text to video AI. I am out of words!

### Failure cases [5:48]

Of course, it is not perfect, the hair of pets is typically still a problem, and the complexity of this ship battle is still a little too much for it to shoulder, so version one is not going to make a new Pirates of the Caribbean movie, but maybe version 3, two more papers down the line? Who knows?

### More beautiful examples [6:10]

Ah yes, about that. The resolution of these videos is not too bad at all, it is in 720p, the literature likes to call it high definition. These are not in 4K like the shows you can watch

### Looking under the hood [6:21]

on your TV, but this quality for a first crack at the problem is simply stunning. And don’t forget that first, it synthesizes a low-resolution video, then upscales it through super resolution, something Google is already really good at, so I would not be surprised for version 2 to easily go to full HD, and maybe even beyond. As you see, the pace of progress in AI research is nothing short of amazing. If, like me, you are yearning for some more results, you can check out the paper’s website in the video description where as of the making of this video,

### Even more results [7:00]

you get a random selection of results. Refresh it a couple times and see if you get something new! And if I could somehow get access to this technique, you bet that I’d be generating a ton more of these. Update: I cannot make any promises, but, good news, we are already working on it. A video of a scholar reading exploding papers absolutely needs to happen. Make sure to subscribe and hit the bell icon to not miss it in case it happens! You really don’t want to miss that.

So, from now on, if you are wondering what a wooden figurine surfing in outer space looks like, you need to look no further. What a time to be alive! So, what do you think? Does this get your mind going? What would you use this for? Let me know in the comments below!

Thanks for watching and for your generous support, and I'll see you next time!