3 New Things An AI Can Do With Your Photos!

Two Minute Papers · 13.03.2021 · 136,235 views · 10,672 likes

Video description
❤️ Check out Weights & Biases and sign up for a free demo here: https://www.wandb.com/papers
❤️ Their mentioned post is available here: https://wandb.ai/mathisfederico/wandb_features/reports/Visualizing-Confusion-Matrices-With-W-B--VmlldzoxMzE5ODk
📝 The paper "GANSpace: Discovering Interpretable GAN Controls" is available here: https://github.com/harskish/ganspace
📝 Our material synthesis paper is available here: https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/
📝 The font manifold paper is available here: http://vecg.cs.ucl.ac.uk/Projects/projects_fonts/projects_fonts.html
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Serban, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Haris Husic, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Joshua Goller, Kenneth Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers
Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2
Thumbnail background image credit: https://pixabay.com/images/id-5330343/
Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (4 segments)

<Untitled Chapter 1>

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Here you see people that don’t exist. How can that be? Well, they don’t exist because these images were created with a neural network-based learning method by the name of StyleGAN2, which can not only create eye-poppingly detailed images, but can also fuse these people together, or generate cars, churches, horses, and of course, cats. The even cooler thing is that many of these techniques allow us to exert artistic control over these images. So how does that happen? How do we control a neural network? It happens through exploring latent spaces. And what is that? A latent space is a made-up place where we try to organize data such that similar things end up close to each other. What you see here is a 2D latent space for generating different fonts. It is hard to explain why these fonts are similar, but most of us would agree that they indeed share some common properties. The cool thing here is that we can explore this latent space with our cursor and generate all kinds of new fonts. You can try this work in your browser; the link is available in the video description. And, luckily, we can build a latent space not only for fonts, but for nearly anything. I am a light transport researcher by trade, so in this earlier paper, we were interested in generating several hundred variants of a material model to populate this scene. In this latent space, we can concoct all of these really cool digital material models. A link to this work is also available in the video description. So let’s recap: one of the cool things we can do with latent spaces is generate new images that are somewhat similar. But there is a problem. As we move in nearly any direction, not just one thing, but many things about the image change. For instance, as we explore the space of fonts here, not just the width of the font changes, everything changes.
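The idea of walking through a latent space, and "fusing two people together" by moving between their latent codes, can be sketched in a few lines. This is a minimal illustration, not the real StyleGAN2: the `generate` function below is a toy stand-in (a random fixed linear map plus a nonlinearity) for a trained generator that would map a 512-dimensional latent vector to an image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained generator such as StyleGAN2:
# maps a 512-D latent vector to a flat "image" vector.
W = rng.standard_normal((512, 64))

def generate(z):
    return np.tanh(z @ W)

# Two random latent codes: "two people that don't exist".
z_a = rng.standard_normal(512)
z_b = rng.standard_normal(512)

# Walking along the straight line between the two codes "fuses" them:
# each intermediate point decodes to a plausible in-between sample.
for t in np.linspace(0.0, 1.0, 5):
    z = (1.0 - t) * z_a + t * z_b
    img = generate(z)  # at t=0 this is sample A, at t=1 sample B
```

With a real generator, each `img` would be a face (or car, church, horse, cat) that smoothly morphs from one endpoint to the other as `t` increases.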
Or if we explore materials here, not just the shininess or the colors of the material change, everything changes. This is great to explore if we can do it in real time. If I change this parameter, not just the car shape changes, the foreground changes, the background changes…again, everything changes! So, these are nice and intuitive controls, but not interpretable controls. Can we get that somehow? The answer is yes, not everything must change: this previous technique, called StyleFlow, builds on StyleGAN2 and can take an input photo of a test subject and edit a number of meaningful parameters. Age, expression, lighting, pose, you name it. For instance, it could also grow Elon Musk a majestic beard. And that’s not all, because Elon Musk is not the only person who got a beard. Look, this is me here, after I got locked up for dropping my papers. And I spent so long in there that I grew a beard. Or I mean, this neural network gave me one. And since the punishment for dropping your papers is not short…in fact, it is quite long…this happened. Ouch. I hereby promise to never drop my papers, ever again. You will also have to hold on to yours too, so stay alert. So, apparently interpretable controls already exist. And I wonder, how far can we push this concept? Beard or no beard is great, but what about cars, what about paintings? Well, this new technique found a way to navigate these latent spaces and introduces three amazing new examples of interpretable controls that I haven’t seen anywhere else yet. One, it can change the car geometry. We can change the sportiness of a car, and even ask the design to be more or less boxy.
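The paper discussed here, GANSpace, discovers its interpretable controls in an unsupervised way: it samples many latent codes, pushes them through the generator's mapping network, and runs PCA on the results; the top principal components turn out to be meaningful edit directions. The sketch below shows that recipe with a toy mapping network standing in for StyleGAN's real one (the actual code is at the GitHub link in the description).

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for StyleGAN's mapping network z -> w. GANSpace runs PCA
# in this intermediate w space, where directions are more disentangled.
M = rng.standard_normal((512, 512)) * 0.1

def mapping(z):
    return np.tanh(z @ M)

# 1. Sample many latent codes and map them into w space.
Z = rng.standard_normal((5_000, 512))
W_space = mapping(Z)

# 2. PCA via SVD: the top principal components of the sampled w vectors
#    are candidate interpretable edit directions.
mean = W_space.mean(axis=0)
_, _, Vt = np.linalg.svd(W_space - mean, full_matrices=False)
directions = Vt[:10]  # the 10 strongest directions

# 3. An "edit" is a walk along one direction with a chosen strength.
w = mapping(rng.standard_normal(512))
w_edited = w + 3.0 * directions[0]  # e.g. "sportier", "more boxy", ...
```

Which semantic concept each direction controls (boxiness, age, brush roughness) is not known in advance; in the paper, the discovered directions are inspected and labeled by a human afterwards.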

Car Geometry

Note that there is some additional damage here, but we can counteract that by changing the foreground to our taste, for instance, add some grass in there. Two, it can repaint paintings.

Repaint Paintings

We can change the roughness of the brush strokes, simplify the style or even rotate the model. This way, we can create or adjust a painting without having to even touch a paintbrush. Three, facial expressions.

Facial Expressions

First, when I started reading this paper, I was a little suspicious. I have seen these controls before, so I looked at it like this, but as I saw how well it did, I went more…like this. And this paper can do way more: for instance, it can add lipstick, change the shape of the mouth or the eyes, and do all this with very little collateral damage to the remainder of the image. Loving it. It can also find and blur the background, similarly to those amazing portrait mode photos that newer smartphones can take. And, of course, it can also do the usual suspects: adjusting the age, hairstyle, or growing a beard. So there we go: now, with the power of neural network-based learning methods, we can create new car designs, repaint paintings without ever touching a paintbrush, and give someone a shave. It truly feels like we are living in a science fiction world. What a time to be alive! Thanks for watching and for your generous support, and I'll see you next time!
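The "very little collateral damage" comes in part from where an edit is applied: StyleGAN2-style generators consume one copy of the latent vector per layer, and GANSpace restricts an edit direction to a subset of layers, so coarse attributes (pose, geometry) and fine ones (lipstick, lighting) can be changed separately. A minimal sketch, with the direction itself being a hypothetical placeholder rather than a real learned "smile" vector:

```python
import numpy as np

# A 1024x1024 StyleGAN2 generator has 18 style layers, each taking
# its own 512-D w vector (the so-called w+ representation).
NUM_LAYERS, DIM = 18, 512
rng = np.random.default_rng(2)

w = rng.standard_normal(DIM)
w_plus = np.tile(w, (NUM_LAYERS, 1))  # one copy of w per layer

# Hypothetical edit direction (in the paper, found via PCA and
# labeled by hand, e.g. "expression" or "lipstick").
direction = rng.standard_normal(DIM)
direction /= np.linalg.norm(direction)

# Apply the edit only to middle layers: earlier layers (pose, geometry)
# and later layers (color, lighting) are left untouched, which is what
# keeps collateral damage to the rest of the image low.
strength = 2.5
edited = w_plus.copy()
edited[3:8] += strength * direction
```

Feeding `edited` to the synthesis network instead of `w_plus` would then change only the attributes those middle layers control.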
