# 1,000,000,000 Parameter Super Resolution AI!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=UyoXmHS-KGc
- **Date:** 30.08.2023
- **Duration:** 4:54
- **Views:** 142,054

## Description

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum

📝 The paper "GigaGAN: Scaling up GANs for Text-to-Image Synthesis" is available here:
https://mingukkang.github.io/GigaGAN/

My latest paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD 

Or here is the original Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Kenneth Davis, Klaus Busse, Kyle Davis, Lukas Biewald, Martin, Matthew Valle, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=UyoXmHS-KGc) Introduction

The first is cool, the second is great, and the third one is simply incredible, to the point that I couldn't believe the results and had to look over and over again. And the fourth is a thing of beauty. So, what are the tricks? First, this work is called GigaGAN, and it can perform text-to-image synthesis.

### [0:27](https://www.youtube.com/watch?v=UyoXmHS-KGc&t=27s) What are the tricks

We Fellow Scholars have seen this before many, many times: you enter a text prompt and it paints you an image. What is great about it is that it can give us reasonably high-quality images. That is okay, but here is the kicker: it can perform this in a fraction of a second. That is extremely quick. For instance, with a previous StyleGAN-based method that was roughly as fast, we needed to make significant concessions in terms of quality. Not anymore. Loving it. Two, it is not only fast, it is so fast that it can create several images per second, and thus it offers a controllable latent space. This is a hallmark of GAN-based methods and leads to incredible artistic controllability, as you see here. But what does all this mean? A GAN-based method means a generative adversarial network, where two neural networks battle each other. One tries to generate new images to fool the other one, while this other one trains to

### [1:32](https://www.youtube.com/watch?v=UyoXmHS-KGc&t=92s) What does this mean

be able to spot these synthetic images. Over time, they battle each other, and they improve together. Now, third, and this is where things get out of hand. Super resolution, or in other words, image upscaling. Here, a coarse image goes in, and the AI guesses what this image could be, and synthesizes a new, really detailed image. Now hold on to your papers, and look at that!
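The adversarial setup described above can be sketched in a few lines. This is a minimal, illustrative toy, not anything from GigaGAN itself: the "images" are single numbers, the generator is a one-parameter affine map, and the discriminator is a logistic classifier. The two still play the same game — the discriminator learns to separate real samples from generated ones, while the generator learns to fool it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real data the generator must imitate: samples from N(4, 1)
def real_batch(n):
    return rng.normal(4.0, 1.0, size=n)

# Generator: g(z) = g_w * z + g_b, trained to fool the discriminator
g_w, g_b = 1.0, 0.0
# Discriminator: d(x) = sigmoid(d_a * x + d_c), trained to spot fakes
d_a, d_c = 0.1, 0.0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, batch = 0.05, 64
for step in range(500):
    z = rng.normal(size=batch)
    fake = g_w * z + g_b
    real = real_batch(batch)

    # --- Discriminator update: push d(real) -> 1 and d(fake) -> 0 ---
    p_real = sigmoid(d_a * real + d_c)
    p_fake = sigmoid(d_a * fake + d_c)
    # Gradients of the binary cross-entropy loss w.r.t. d_a and d_c
    grad_a = np.mean((p_real - 1) * real) + np.mean(p_fake * fake)
    grad_c = np.mean(p_real - 1) + np.mean(p_fake)
    d_a -= lr * grad_a
    d_c -= lr * grad_c

    # --- Generator update: push d(fake) -> 1 (fool the discriminator) ---
    fake = g_w * z + g_b
    p_fake = sigmoid(d_a * fake + d_c)
    # Chain rule through the discriminator: dLoss/dfake = (p_fake - 1) * d_a
    dfake = (p_fake - 1) * d_a
    g_w -= lr * np.mean(dfake * z)
    g_b -= lr * np.mean(dfake)

print(f"learned generator bias: {g_b:.2f} (real data mean: 4.0)")
```

As the two networks "battle," the generator's output drifts toward the real data distribution — the same dynamic, at toy scale, that trains the full image generator.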

### [2:07](https://www.youtube.com/watch?v=UyoXmHS-KGc&t=127s) Super Resolution

The difference can be really huge: up to 1,000 times more pixels in the new image. So how does it compare to previous methods, for instance, the amazing Stable Diffusion? Oh my, look at that. It is so much better across the board, pretty much everywhere. But the eyes, the eyes are truly something else with the new technique. What a time to be alive! And finally, fourth: it also offers a disentangled latent space for controllability. Now, I hear you asking: Doctor, what does that mean? It means that we can control the style in place with text prompts.
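The "1,000 times more pixels" figure is less outlandish than it sounds once you remember that pixel count grows with the square of the side length. A quick sanity check (the 128 → 4096 resolution pair is an illustrative example, not a figure quoted from the paper):

```python
# Pixel count scales with the *square* of the side length, so a ~32x
# upscale per side already yields a ~1,000x increase in pixels.
low_side, high_side = 128, 4096           # example input/output side lengths

scale_per_side = high_side / low_side     # 32x per side
pixel_factor = scale_per_side ** 2        # 1024x more pixels

low_pixels = low_side ** 2                # 16,384 pixels in
high_pixels = high_side ** 2              # ~16.8 million pixels out

print(f"{scale_per_side:.0f}x per side -> {pixel_factor:.0f}x more pixels")
```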

### [2:49](https://www.youtube.com/watch?v=UyoXmHS-KGc&t=169s) Style

So whenever we get a teddy bear that we really like, but we would like to see it crocheted, a bit fluffier, or made of denim, we can do that without generating a new teddy bear. All the changes are applied to the same subject. That is incredible. I also loved the ball example here; this almost looks like an image straight out of a material modeling paper in computer graphics. What I absolutely love about this paper is that normally, for these four problems, we
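The idea behind a "disentangled" latent space can be illustrated with a deliberately trivial toy (this is a conceptual sketch, nothing like the real GigaGAN generator): one latent code controls *what* is in the image (the subject), a separate code controls *how* it looks (the style). Editing or interpolating the style code leaves the subject code, and hence the subject, untouched.

```python
import numpy as np

rng = np.random.default_rng(1)

subject_code = rng.normal(size=8)    # fixed: "the same teddy bear"
style_fluffy = rng.normal(size=4)    # one style direction
style_denim = rng.normal(size=4)     # another style direction

def toy_generator(subject, style):
    # Stand-in for a generator whose output depends on both codes; here
    # the two codes occupy separate, non-interacting output dimensions.
    return np.concatenate([subject, style])

# Interpolate between styles while holding the subject fixed:
for t in (0.0, 0.5, 1.0):
    style = (1 - t) * style_fluffy + t * style_denim
    img = toy_generator(subject_code, style)
    # The subject part of the output never changes as the style varies.
    assert np.allclose(img[:8], subject_code)
```

In a real generator the two codes are mixed by the network rather than concatenated, and disentanglement means the mixing still keeps subject and style separately editable — which is exactly the crocheted/fluffy/denim teddy bear behavior shown in the video.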

### [3:24](https://www.youtube.com/watch?v=UyoXmHS-KGc&t=204s) Conclusion

would need four separate tools. Not anymore! Today we are getting all of these amazing capabilities with just one tool that can do all of them, and not only that, but it is so much faster than previous methods while still being competitive in terms of visual quality. It is not the best and the fastest at the same time, not even close, but I think this tradeoff is an amazing value proposition. If you enjoyed this paper, make sure to subscribe and hit the bell icon so you don't miss out; we

### [4:01](https://www.youtube.com/watch?v=UyoXmHS-KGc&t=241s) Outro

have some more amazing papers coming up soon. Thanks for watching and for your generous support, and I'll see you next time!

---
*Source: https://ekstraktznaniy.ru/video/13053*