# NVIDIA’s AI Transformed My Chihuahua Into a Lion

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=SfvRhqsmU4o
- **Date:** 01.06.2019
- **Duration:** 4:42
- **Views:** 62,081

## Description

Check out Lambda Labs here: https://lambdalabs.com/papers

📝 The paper "Few-Shot Unsupervised Image-to-Image Translation" and its demo are available here:
https://nvlabs.github.io/FUNIT/
https://nvlabs.github.io/FUNIT/petswap.html

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Bruno Brito, Bryan Learn, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Daniel Hasegan, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, Ivelin Ivanov, James Watt, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Richard Reis, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Zach Boldyga.
https://www.patreon.com/TwoMinutePapers

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

#NVIDIA #FUNIT

## Contents

### [0:00](https://www.youtube.com/watch?v=SfvRhqsmU4o) Segment 1 (00:00 - 04:00)

This episode has been supported by Lambda Labs. Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. Let’s talk about a great recent development in image translation! Image translation means that some image goes in and is translated into an analogous image of a different class. A good example would be giving the algorithm a standing tiger as input and asking it to translate this image into the same tiger lying down. This leads to many amazing applications: for instance, we can specify a daytime image and get the same scene at nighttime, go from maps to satellite images, from video games to reality, and more.

However, much like many learning algorithms today, most of these techniques have a key limitation: they need a lot of training data. In other words, these neural networks require seeing a ton of images in all of these classes before they can learn to meaningfully translate between them. This is clearly inferior to how humans think, right? If I showed you a horse, you could easily imagine, and some of you could even draw, what it would look like if it were a zebra instead. As I am sure you have noticed by reading arguments on many internet forums, humans are pretty good at generalization. So, how could we possibly develop a learning technique that can look at very few images and obtain knowledge from them that generalizes well?

Have a look at this crazy new paper from scientists at NVIDIA that accomplishes exactly that. In this example, they show an input image of a golden retriever, and then we specify the target classes by showing the algorithm a handful of images of different animal breeds, and… look! In goes your golden, and out comes a pug, or any other dog breed you can think of. And now, hold on to your papers, because this AI didn’t have access to these target images during training; it saw them for the very first time when we handed them over. It can perform this translation with previously unseen object classes.

How is this insanity even possible? This work contains a generative adversarial network that assumes the training set we give it contains images of different animals, and during training it practices the translation process between these animals. It also contains a class encoder that creates a low-dimensional latent space for each of these classes, which means that it tries to compress these images down to a few features that contain the essence of each individual dog breed. Apparently, it can learn the essence of these classes remarkably well, because it was able to convert our image into a pug without ever seeing a pug other than this one target image. As you can see here, it comes out way ahead of previous techniques, but of course, if we give it a target image that is dramatically different from anything the AI has seen before, it may falter.

Luckily, you can even try it yourself through this web demo, which works on pets, so make sure to read the instructions carefully, and let the experiments begin! In fact, due to popular request, let me kick this off with Lisa, my favorite chihuahua. I got many tempting alternatives, but worry not: in reality, she will stay as is. I was also curious about trying a non-traditional head position, and as you can see from the results, this was a much more challenging case for the AI. The paper also discusses this limitation in more detail. You know the saying: two more papers down the line, and I am sure this will also be remedied.
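To make the architecture described above more concrete, here is a minimal PyTorch sketch of the few-shot translation idea: a content encoder keeps the layout and pose of the input animal, a class encoder compresses the handful of target images into one low-dimensional class code, and a decoder combines the two. All module names, layer sizes, and the simple feature-wise modulation below are illustrative assumptions for this sketch, not NVIDIA's actual FUNIT implementation (the real model uses AdaIN-style modulation; see https://nvlabs.github.io/FUNIT/).

```python
# Minimal sketch of FUNIT-style few-shot image translation.
# All names and layer sizes are illustrative assumptions, not NVIDIA's code.
import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Maps the input image (e.g. the golden retriever) to a spatial content code."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=1, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)  # (1, 256, H/4, W/4)

class ClassEncoder(nn.Module):
    """Compresses each target image to a low-dimensional class code that
    captures the 'essence' of the breed, then averages over the K examples."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=1, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # collapse spatial dimensions
            nn.Flatten(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, targets):
        # targets: (K, 3, H, W) -- the K few-shot images of the unseen class.
        codes = self.net(targets)               # (K, latent_dim)
        return codes.mean(dim=0, keepdim=True)  # one averaged class code

class Decoder(nn.Module):
    """Rebuilds an image from the content code, modulated by the class code."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.modulate = nn.Linear(latent_dim, 256)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 7, stride=1, padding=3), nn.Tanh(),
        )

    def forward(self, content, class_code):
        # Simple feature-wise scaling stands in for FUNIT's AdaIN layers.
        scale = self.modulate(class_code).unsqueeze(-1).unsqueeze(-1)
        return self.net(content * scale)

# Inference: translate one source image using K previously unseen target images.
content_enc, class_enc, dec = ContentEncoder(), ClassEncoder(), Decoder()
source = torch.randn(1, 3, 128, 128)   # stand-in for the chihuahua photo
targets = torch.randn(5, 3, 128, 128)  # stand-in for 5 images of the target breed
with torch.no_grad():
    out = dec(content_enc(source), class_enc(targets))
print(out.shape)  # torch.Size([1, 3, 128, 128])
```

Averaging the class codes over the K target images is what lets even a single photo of a never-before-seen breed steer the output: the more example images we provide at test time, the more stable the class code becomes.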
I am hoping that you will also try this on your own pets and, as a Fellow Scholar, flood the comments section with your findings. Strictly for science, of course. If you’re doing deep learning, make sure to look into Lambda GPU systems. Lambda offers workstations, servers, laptops, and a GPU cloud for deep learning. You can save up to 90% over AWS, GCP, and Azure GPU instances. Every Lambda GPU system is pre-installed with TensorFlow, PyTorch, and Keras; just plug it in and start training. Lambda customers include Apple, Microsoft, and Stanford. Go to lambdalabs.com/papers or click the link in the description to learn more. Big thanks to Lambda for supporting Two Minute Papers and helping us make better videos. Thanks for watching and for your generous support, and I'll see you next time!

---
*Source: https://ekstraktznaniy.ru/video/14305*