# NVIDIA's New AI: Next Level Image Editing! 👌

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=cS4jCvzey-4
- **Date:** 16.04.2022
- **Duration:** 7:11
- **Views:** 134,901
- **Source:** https://ekstraktznaniy.ru/video/13595

## Description

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.com/papers 
❤️ Their mentioned post is available here (thank you Soumik Rakshit!): http://wandb.me/EditGAN

📝 The paper "EditGAN: High-Precision Semantic Image Editing" is available here:
https://nv-tlabs.github.io/editGAN/
https://arxiv.org/abs/2111.03186
https://github.com/nv-tlabs/editGAN_release
https://nv-tlabs.github.io/editGAN/editGAN_supp_compressed.pdf

❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: 
- https://www.patreon.com/TwoMinutePapers
- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Lorin A

## Transcript

### Segment 1 (00:00 - 05:00)

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we are going to do some incredible experiments with NVIDIA's next-level image editor AI. Now, there are already many AI-based techniques out there that are extremely good at creating new human, or even animal faces. You see here how beautifully this Alias-Free GAN technique can morph one result into another. It truly is a sight to behold. They are also really good at editing images. You see, this was possible just a year ago. But this new one can do more semantic edits to our images. Let's dive in and see all the amazing things it can do! What happens if we take a photo of friends we haven't seen in a while? Well, of course, the eyes are closed, or are barely open. From now on, this is not a problem. Done! And if we wish to add a smile or take it away, boom, that is also possible. Also, the universal classic: looking too much into the camera? Not a problem. Done! Now, the AI can do even more kinds of edits: hairstyle, eyebrows, wrinkles, you name it. However, that is not even the best part. You have seen nothing yet! Are you ready for the best part? Hold on to your papers, and check this out! Yes, it even works on drawings… paintings. Oh my! And even statues as well. Absolutely amazing. How cool is that? I love it. Researchers refer to these as out-of-domain examples. The best part is that this is a proper learning-based method. This means that by learning on human portraits, it has obtained general knowledge. So now, it doesn't just see these as clumps of pixels; it now understands concepts. Thus, it can reuse its knowledge even when facing a completely different kind of image, just like the ones you see here. This looks like science fiction, and here it is, right in front of our eyes! Wow. But it doesn't stop there. It can also perform semantic image editing. What is that? Well, look!
We can upload an image, look at the labels of these images, and edit the labels themselves. Well, okay, but what is all that good for? Well, the AI understands how these labels correspond to the real photo. So, check this out: we do the easy part, editing the labels, and the AI does the hard part, changing the photo appropriately. Look! Yes! This is just incredible. The best part is that we are now even seeing a hint of creativity in some of these solutions. And if you are also one of those folks who feel like the wheels and rims are never big enough, well, NVIDIA has got you covered, that's for sure. And here comes the kicker: it learned to create these labels automatically by itself. So, how many label-to-image pairs did it have to look at to perform all this? Well, what do you think? Millions? Hundreds of thousands? Please leave a comment, I'd love to hear what you think. And the answer is 16. What? 16 million? Nope. 16. Two to the power of four. That's it. That is one of the most jaw-dropping facts about this paper. This AI can learn general concepts from very few examples. We don't need to label the entirety of the internet for this technique to be able to do its magic. That is absolutely incredible. Real learning from just 16 examples. Wow. The supplementary materials also showcase loads of results, so make sure to have a look at those. If you do, you'll find that it can even take an image of a bird and adjust its beak size, even to extreme proportions. Very amusing. Or we can even ask the birds to look up, and I have to say, if no one told me that these are synthetic images, I might not be able to tell.
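The core trick behind this "edit the labels, and the image follows" workflow can be sketched in a toy form. The EditGAN paper embeds the photo into a GAN's latent space and jointly models the image and its segmentation labels, so that optimizing a small latent offset to match the user-edited labels also drags the image along. The sketch below is a hypothetical, heavily simplified stand-in: both "branches" are just linear maps (`A` for the image, `B` for the labels), and the regularizer weight is an illustrative assumption, not anything from the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, img_dim, mask_dim = 8, 16, 4
A = rng.normal(size=(img_dim, latent_dim))   # toy stand-in for the image branch
B = rng.normal(size=(mask_dim, latent_dim))  # toy stand-in for the label branch

w = rng.normal(size=latent_dim)              # latent code of the uploaded photo
# The user edits the labels: here we just nudge two label values by hand.
target = B @ w + np.array([1.0, 0.0, -0.5, 0.0])

# Optimize a latent offset so the label branch matches the edited labels,
# with a small penalty keeping the edit close to the original latent code.
delta, lam, lr = np.zeros(latent_dim), 0.01, 0.01
for _ in range(1000):
    residual = B @ (w + delta) - target
    grad = 2 * B.T @ residual + 2 * lam * delta
    delta -= lr * grad

# Because both branches share the latent code, the image branch now
# reflects the label edit as well.
edited_image = A @ (w + delta)
```

In the real method, the generator is of course a deep network and the optimization runs in its learned latent space, but the shape of the idea is the same: the user touches only the labels, and gradient descent does the hard part of changing the photo consistently.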

### Segment 2 (05:00 - 07:00)

Now note that some of these capabilities were present in previous techniques, but this new method does all of them with really high quality, and all in one elegant package. What a time to be alive! Now, of course, not even this technique is perfect. There are still cases that are so far outside of the training set of the AI that the result becomes completely unusable. And this is the perfect place to invoke the First Law of Papers. What is that? Well, it says that research is a process. Do not look at where we are; look at where we will be two more papers down the line. And don't forget, a couple of papers before this, we were lucky to get even this. And now, see how far we have come. This is incredible progress in just one year. So, what do you think? What would you use this for? I'd love to hear your thoughts; please let me know in the comments below. Thanks for watching and for your generous support, and I'll see you next time!
