Latent Space Human Face Synthesis | Two Minute Papers #191

Two Minute Papers · 24.09.2017 · 35,388 views · 1,169 likes


Video description
The paper "Optimizing the Latent Space of Generative Networks" is available here: https://arxiv.org/pdf/1707.05776.pdf

Khan Academy's video on the Nash equilibrium: https://www.khanacademy.org/economics-finance-domain/microeconomics/nash-equilibrium-tutorial/nash-eq-tutorial/v/prisoners-dilemma-and-nash-equilibrium

Earlier episodes showcased in the video:
Image Editing with Generative Adversarial Networks - https://www.youtube.com/watch?v=pqkpIfu36Os
AI Learns to Synthesize Pictures of Animals - https://www.youtube.com/watch?v=D4C1dB9UheQ
AI Makes 3D Models From Photos - https://www.youtube.com/watch?v=HO1LYJb818Q

Font paper: http://vecg.cs.ucl.ac.uk/Projects/projects_fonts/projects_fonts.html

We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Andrew Melnychuk, Brian Gilman, Dave Rushton-Smith, Dennis Abts, Esa Turkulainen, Evan Breznyik, Kaben Gabriel Nanlohy, Michael Albrecht, Michael Jensen, Michael Orenstein, Steef, Sunil Kim, Torsten Reil. https://www.patreon.com/TwoMinutePapers

Two Minute Papers Merch:
US: http://twominutepapers.com/
EU/Worldwide: https://shop.spreadshirt.net/TwoMinutePapers/

Music: Antarctica by Audionautix is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/)
Artist: http://audionautix.com/

Thumbnail background image credit: https://pixabay.com/photo-2589641/
Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (1 segment)

Segment 1 (00:00 - 04:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. In many previous episodes, we talked about generative adversarial networks, a recent line of machine learning research with some absolutely fantastic results in a variety of areas. They can synthesize new images of animals, create 3D models from photos, or dream up new products based on our edits of an image. A generative adversarial network means that we have two neural networks battling each other in an arms race. The generator network tries to create more and more realistic images, and these are passed to the discriminator network, which tries to learn the difference between real photographs and fake, forged images. During this process, the two neural networks learn and improve together until they become experts at their own craft. And as you can see, the results are fantastic. However, training these networks against each other is anything but roses and sunshine. We don't know whether the process converges, or whether we reach a Nash equilibrium. A Nash equilibrium is a state where both actors believe they have found an optimal strategy while taking into account the other actor's possible decisions, and neither of them has any interest in changing their strategy. The classic example is the prisoner's dilemma, a scenario in game theory where two convicted criminals are pondering whether they should snitch on each other without knowing how the other decided to act. If you wish to hear more about the Nash equilibrium, I've put a link to Khan Academy's video in the description. Make sure to check it out, you'll love it! I find it highly exciting that there are parallels between AI and game theory. However, the even cooler thing is that here, we try to build a system where we don't have to deal with such a situation at all. This is called Generative Latent Optimization, GLO for short, and it introduces tricks to do this using only a generator network. If you have ever read up on font design, you know that it is a highly complex field.
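The GLO idea just introduced can be sketched in a few lines: instead of training a discriminator, we jointly optimize a generator and one learnable latent code per training sample against a plain reconstruction loss. Everything below is a toy stand-in — random data, and a linear map in place of the paper's convolutional generator — so treat it as a minimal sketch of the training principle, not the actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dataset": 20 samples in a 10-dimensional observation space.
n, d_obs, d_lat = 20, 10, 3
X = rng.normal(size=(n, d_obs))

# Linear toy "generator" G(z) = z @ W, plus one learnable latent code per sample.
W = rng.normal(scale=0.1, size=(d_lat, d_obs))
Z = rng.normal(scale=0.1, size=(n, d_lat))

lr = 0.05
for step in range(1000):
    err = Z @ W - X          # reconstruction error -- no discriminator anywhere
    grad_W = Z.T @ err / n   # (scaled) gradient of the squared error w.r.t. W
    grad_Z = err @ W.T / n   # gradient w.r.t. the latent codes themselves
    W -= lr * grad_W
    Z -= lr * grad_Z

loss = np.mean((Z @ W - X) ** 2)
print(f"final reconstruction loss: {loss:.4f}")
```

Because each sample owns its latent code, there is no adversarial game to balance: the loss is an ordinary descent objective, which sidesteps the convergence worries mentioned above.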
However, if we'd like to create a new font type, what we're typically interested in is only a few features, like how curvy the letters are, or whether we're dealing with a serif kind of font, and simple descriptions like that. The same principle can be applied to human faces, animals, and most topics you can imagine. This means that there are many complex concepts that contain a ton of information, most of which can be captured by a simple description with only a few features. This is done by projecting this high-dimensional data onto a low-dimensional latent space. This latent space helps eliminate adversarial optimization, which makes the system much easier to train, and the main selling point is that it still retains the attractive properties of generative adversarial networks. This means that it can synthesize new samples from the learned dataset: if it has learned the concept of birds, it will be able to synthesize new bird species. It can perform continuous interpolation between data points, which means that, for instance, we can produce intermediate states between two chosen furniture types or light fixtures. It is also able to perform simple arithmetic operations between any number of data points. For instance, if A is males with sunglasses, B is males without sunglasses, and C is females, then A - B + C is going to generate females in sunglasses. It can also do super-resolution and much, much more. Make sure to have a look at the paper in the video description. Now, before we go, we shall address the elephant in the room: these images are tiny. Our seasoned Fellow Scholars know that for generative adversarial networks, there are plenty of works on how to synthesize high-resolution images with more details. This means that this is a piece of work that opens up exciting new horizons, but it is not to be measured against the tenth follow-up work on top of a more established line of research.
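The latent-space operations mentioned above — arithmetic like A - B + C and continuous interpolation — boil down to simple vector math on the latent codes. The three-dimensional codes below are made up purely for illustration; in the real system they would come from the trained model, and each result would be decoded back into an image by the generator.

```python
import numpy as np

# Hypothetical latent codes (in a real system these come from the trained model).
a = np.array([ 1.0,  0.5, -0.2])   # e.g. "males with sunglasses"
b = np.array([ 1.0, -0.5, -0.2])   # e.g. "males without sunglasses"
c = np.array([-1.0, -0.5,  0.3])   # e.g. "females"

# A - B + C: (a - b) isolates the "sunglasses" direction, which is then
# added onto c, giving a code for "females with sunglasses".
d = a - b + c

# Continuous interpolation: each intermediate point along the straight line
# between two codes decodes to a plausible in-between sample.
ts = np.linspace(0.0, 1.0, 5)
path = [(1 - t) * a + t * d for t in ts]
```

The interpolation path starts exactly at `a` and ends exactly at `d`; feeding its points to the generator would produce the gradual morphs shown in the video.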
Two Minute Papers will be here for you to keep you updated on the progress, which is, as we know, staggeringly quick in machine learning research. Don't forget to subscribe and click the bell icon to never miss an episode. Thanks for watching and for your generous support, and I'll see you next time!
