# Google’s New AI Watched 30,000,000 Videos!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=_RSoTpAeiMM
- **Date:** 08.02.2024
- **Duration:** 7:51
- **Views:** 74,377

## Description

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

📝 The paper "LUMIERE: A Space-Time Diffusion Model for Video Generation" is available here:
https://lumiere-video.github.io/

📝 My latest paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD 

Or the original Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Putra Iskandar, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Károly Zsolnai-Fehér's research works: https://cg.tuwien.ac.at/~zsolnai/
Twitter: https://twitter.com/twominutepapers

## Contents

### [0:00](https://www.youtube.com/watch?v=_RSoTpAeiMM) Segment 1 (00:00 - 05:00)

This is unbelievable. I just talked about a Google AI research paper that takes text and creates great videos out of it. I barely finished talking about it, and what do I see? Wow, a better version is already out there. The pace of progress in AI research is so incredibly quick. Unbelievable. This one has looked at 30 million videos and can create 1-megapixel videos, a million pixels, for up to 5 seconds. So what can we do with those 1 million pixels? Well, we can do six things amazingly well with it. Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér.

One: text to video. Whether you want a surfing teddy bear, delicious sushi or muffins, or if you're wondering what a keyboard made for mice would look like, this works really well, and I love that it even has a hint of creativity.

Two: image to video. It can do absolute miracles, like taking the legendary Girl with a Pearl Earring and animating it. I wonder what it would look like if we made her smile. Well, let's find out together. Look at that, so good! And I absolutely loved the variety of things that this can generate. It feels like absolutely any topic works, even ones, again, that need a hint of creativity in a machine. How cool is that? What a time to be alive!

Now, three: hold on to your papers, Fellow Scholars, for stylized generation. This will be mildly reminiscent of style transfer, but a great deal more powerful. What is style transfer? An incredible technique from the early days of Two Minute Papers, where you could add one image for style and one for content, and it would fuse the two together. It was fantastic, especially since it can also be done in an illumination-guided manner, something that we showcased 700 episodes ago. Over time, researchers extended the idea to doing style transfer even on simulations. That is incredible. But you know what's better? Giving just one image for style and a text prompt of what we would like to see, and out comes a video with that style. That is insanity. For instance, look at that bear dancing. This is me when I get my hands on a fresh paper like this. Loving it!

And now, four: we can also stylize or edit videos. Oh yeah, look, that's me again. So here, the input would be a video of yours plus a prompt that describes the instructions as to how to recreate the video. Just imagine the creative possibilities it could unlock. You can run, dance, or do anything, and ask it to take your motion and make a teddy bear or a robot do that. And this is really just some surface-level stuff; you can even rebuild yourself from origami, toy bricks, you name it.

Five: cinemagraphs. What are those? Well, you choose a region in an image, and it will be animated while the rest of the image stays as it is.

Six: inpainting means that we have a little gap in an image and we look to fill it with sensible information. This is super difficult. Here you see PatchMatch, an amazing handcrafted computer graphics algorithm that could do it, and it was powered by pure human ingenuity. But today, with the power of these learning algorithms, we can even do this for video. Now, here is the not-that-cool use case: you have somehow lost part of the video and would like to restore it. That also works. But the cool use case is where you have the full video and you just want some part of it changed and recreated. If you want to add some new ingredients, or create a completely different cake and add the chocolate sauce to that, not a problem. Can you tell that this is not the original video but one synthesized by an AI? Let me know in the comments below.

So how does this wizardry work? Well, it generates all these videos at an initial resolution of 128 by 128, and then upscales them to 1024 by

### [5:00](https://www.youtube.com/watch?v=_RSoTpAeiMM&t=300s) Segment 2 (05:00 - 07:00)

1024. It is like an artist who first uses a pencil to create the outlines, and only then finishes the artwork with fine brush strokes. But wait, there are so many similar techniques out there, dozens at this point. Is this really better than them? Well, we can test that by calling in a bunch of humans and having them look at the text-to-video results of this technique versus previous techniques, of course in a randomly ordered manner, so we don't know which side is which. And if we do that, what do we get? Oh my goodness, Fellow Scholars, we get a clean sweep: this new one is preferred by the people in every single case in aggregate. One of the closer competitors is Imagen Video, something that we talked about here earlier. That was fantastic. Incidentally, it is also a paper from scientists at Google. Bravo! I will note that the comparisons with numbers look a little more competitive and a bit less one-sided.

Now let's pop the hood and have a look in there. Oh yes, the new technique also uses MultiDiffusion. This reduces the amount of sudden jumps during the creation of these videos. It is like a special kind of glue that makes them stick together better, creating scenes with no awkward jumps. And what I like here is that the new technique also has fewer moving parts than previous methods. Loving it! I believe this new paper is going to be an amazing asset for all of us to be able to unleash our creativity, and I can't wait to see where you, Fellow Scholars, are going to take it once the first public implementations of this paper appear.

If you're looking for inexpensive cloud GPUs for AI, Lambda now offers the best prices in the world for GPU cloud compute. No commitments or negotiation required; just sign up and launch an instance. And hold on to your papers, because with the Lambda GPU Cloud you can now get on-demand H100 instances, and they are one of the first cloud providers to offer publicly available on-demand H100s. Did I mention they also offer persistent storage? So join
researchers at organizations like Apple, MIT, and Caltech in using Lambda Cloud instances, workstations, or servers. Make sure to go to https://lambdalabs.com/papers to sign up for one of their amazing GPU instances today.
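The coarse-to-fine generation described in the segment above (produce the whole clip at a low 128 by 128 resolution first, then upscale it to 1024 by 1024) can be sketched as a toy pipeline. Everything here is a stand-in: `base_model` and `spatial_superres` are hypothetical placeholders for Lumiere's actual networks, and the frame count and shapes are only illustrative.

```python
import numpy as np

def base_model(prompt: str, frames: int = 80, size: int = 128) -> np.ndarray:
    # Stand-in for the base space-time diffusion model: it would
    # produce the entire low-resolution clip in one pass.
    rng = np.random.default_rng(0)  # fixed seed in place of a real model
    return rng.random((frames, size, size, 3))  # (T, H, W, C)

def spatial_superres(video: np.ndarray, factor: int = 8) -> np.ndarray:
    # Stand-in upscaler: nearest-neighbour repetition instead of a
    # learned spatial super-resolution stage.
    return video.repeat(factor, axis=1).repeat(factor, axis=2)

low = base_model("a teddy bear surfing")  # low-res clip at 128 x 128
high = spatial_superres(low)              # same frames at 1024 x 1024
print(low.shape, high.shape)              # (80, 128, 128, 3) (80, 1024, 1024, 3)
```

The point of the sketch is only the two-stage structure: a cheap pass that fixes the outline of the whole video, followed by a refinement pass that adds the fine detail.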
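The MultiDiffusion-style "glue" mentioned above can be illustrated with a toy sketch: run a denoiser over overlapping temporal windows and average the predictions where windows overlap, so neighbouring segments agree and there are no sudden jumps. `denoise_window` is a hypothetical placeholder for one diffusion denoising step, not the real model.

```python
import numpy as np

def denoise_window(window: np.ndarray) -> np.ndarray:
    # Placeholder for one denoising step of a diffusion model
    # applied to a short temporal window of frames.
    return window * 0.5

def blended_step(video: np.ndarray, win: int = 16, stride: int = 8) -> np.ndarray:
    # Slide overlapping windows over the clip, accumulate the
    # per-window predictions, and average them where they overlap.
    out = np.zeros_like(video)
    weight = np.zeros(len(video))
    for start in range(0, len(video) - win + 1, stride):
        out[start:start + win] += denoise_window(video[start:start + win])
        weight[start:start + win] += 1
    return out / weight[:, None, None, None]

video = np.ones((32, 8, 8, 3))  # tiny toy "video" of 32 frames
result = blended_step(video)
print(result.shape)             # (32, 8, 8, 3)
```

Because every frame is the average of all windows that cover it, a frame shared by two windows cannot jump to two inconsistent values, which is the seam-avoiding behaviour the video describes.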

---
*Source: https://ekstraktznaniy.ru/video/12734*