❤️ Check out Weights & Biases and sign up for a free demo here: https://www.wandb.com/papers
❤️ Their report for this paper is available here: https://wandb.ai/wandb/in-domain-gan/reports/In-Domain-GAN-Inversion--VmlldzoyODE5Mzk
📝 The paper "In-Domain GAN Inversion for Real Image Editing" is available here:
https://genforce.github.io/idinvert/
Check out the research group's other works, there is lots of cool stuff there:
https://genforce.github.io/
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Lau, Eric Martel, Gordon Child, Haris Husic, Javier Bustamante, Joshua Goller, Lorin Atzberger, Lukas Biewald, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh.
If you wish to support the series, click here: https://www.patreon.com/TwoMinutePapers
Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/
#deaging
Table of Contents (2 segments)
Segment 1 (00:00 - 05:00)
Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today, we are living in the advent of neural network-based image generation algorithms. What you see here are some super high-quality results from a technique developed by scientists at NVIDIA called StyleGAN2. Right, all of these were generated by a learning algorithm. And while generating images of this quality is a great achievement, if we have an artistic vision, we wonder: can we bend these images to our will? Can we control them? Well, kind of, and one of the methods that enables us to do that is called image interpolation. This means that we have a reference image for style and a target image, and with this, we can morph one human face into another. This is sufficient for some use cases; however, if we are looking for more elaborate edits, we hit a wall. Now, it's good that we already know what StyleGAN is, because this new work builds on top of that and shows exceptional image editing and interpolation abilities.

Let's start with the image editing part! With this new work, we can give anyone glasses and a smile, or even better, transform them into a variant of the Mona Lisa. Beautiful. The authors of the paper call this process semantic diffusion. Now, let's have a closer look at the expression and pose change possibilities. I really like that we have fine-grained control over these parameters, and what's even better, we don't just have a start and an endpoint; all the intermediate images make sense and can stand on their own. This is great for pose and expression because we can control how big of a smile we are looking for, or even better, we can adjust the age of the test subject with remarkable granularity. Let's go all out! I like how Mr. Cumberbatch looks nearly the same as a baby; we might have a new mathematical definition for baby face right there, and apparently, Mr. DiCaprio scores a bit lower on that. And I would say that both results are quite credible! Very cool!

And now, onto image interpolation. What does this new work bring to the table in this area? Previous techniques are also pretty good at morphing…until we take a closer look at them. Let's continue our journey with three interpolation examples of increasing difficulty.

Let's see the easy one first. I was looking for a morphing example with long hair; you will see why right away. This is how the older method did. Uh-oh. One more time. Do you see what I see? If I stop the process here, you see that this is an intermediate image that doesn't make sense. The hair over the forehead just suddenly vanishes into the ether. Let's see how the new method deals with this issue! Wow, much cleaner, and I can stop nearly anywhere and leave the process with a usable image. Easy example, checkmark.

Now let's see an intermediate-level example! Let's go from an old black-and-white Einstein photo to a recent picture with colors and stop the process at different points, and…yes, I prefer the picture created with the new technique nearly every single time! Do you agree? Let me know in the comments below! Intermediate example, checkmark.

And now, onwards to the hardest, nastiest example. This is going to sound impossible, but we are going to transform the Eiffel Tower into the Tower Bridge. Yes, that sounds pretty much impossible. So let's see how a conventional interpolation technique did here. Well…that's not good. I would argue that nearly none of the images showcased here would be believable if we stopped the process and took them out.
And let's see the new method. Hmm, that makes sense: we start with one tower, then two towers grow from the ground, and…look! Wow! The bridge slowly appears between them. That was incredible. While we look at some more results, what really happened here? At the risk of simplifying the contribution of this new paper, we can say that during interpolation, it ensures that we remain within the same domain for the intermediate images. Intuitively, as a result, we get less nonsense in the outputs, and we can pull off morphing not only between human faces, but can even go from a black-and-white photo to a color one, and what's more, it can even deal with completely different building types. Or, you know, just transform people into Mona Lisa variants.
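For the technically curious, here is a minimal, hypothetical sketch of the two ingredients described above: inverting a photo into the generator's latent space with a regularizer that keeps the code in-domain (close to what an encoder predicts for the synthesized image), and then morphing by linearly blending two inverted codes and decoding the intermediate frames. The tiny linear generator and encoder below are stand-ins, not the authors' released StyleGAN2 models, and the objective is deliberately simplified.

```python
import torch
import torch.nn.functional as F

# Stand-in networks: a real pipeline would use a pretrained StyleGAN2 generator
# and a domain-guided encoder; these tiny linear layers only illustrate shapes.
generator = torch.nn.Linear(512, 3 * 64 * 64)   # latent code -> flattened image
encoder = torch.nn.Linear(3 * 64 * 64, 512)     # image -> latent code

def invert(target, steps=200, lam=2.0, lr=0.01):
    """Optimize a latent code that reconstructs the target image while the
    re-encoded synthesis stays close to it (the in-domain regularizer)."""
    z = encoder(target).detach().requires_grad_(True)  # encoder gives the start
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x_hat = generator(z)
        loss = F.mse_loss(x_hat, target) + lam * F.mse_loss(z, encoder(x_hat))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()

def morph(z_a, z_b, num_frames=8):
    """Linearly blend two inverted codes and decode every intermediate frame."""
    alphas = torch.linspace(0.0, 1.0, num_frames)
    return [generator((1 - a) * z_a + a * z_b).reshape(3, 64, 64) for a in alphas]

photo_a = torch.randn(1, 3 * 64 * 64)  # flattened stand-ins for two real photos
photo_b = torch.randn(1, 3 * 64 * 64)
frames = morph(invert(photo_a), invert(photo_b))  # in-between images of the morph
```

The design intuition, as the video puts it, is that keeping the optimized code near what the encoder would predict discourages latent codes the generator was never trained to handle, which is why the in-between frames remain plausible images rather than nonsense.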
Segment 2 (05:00 - 06:00)
Absolutely amazing. What a time to be alive! Thanks for watching and for your generous support, and I'll see you next time!