This AI Hallucinates Images For You

3:41

Two Minute Papers · 03.09.2019 · 53,465 views · 2,148 likes

Video description

📷 We are now available on Instagram: https://www.instagram.com/twominutepapers/

📝 The paper "On the steerability of generative adversarial networks" is available here: https://ali-design.github.io/gan_steerability/
The paper "Learning a Manifold of Fonts" and its demo are available here: http://vecg.cs.ucl.ac.uk/Projects/projects_fonts/projects_fonts.html
Our material synthesis paper is available here: https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/

❤️ Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Bruno Brito, Bryan Learn, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Daniel Hasegan, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, James Watt, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Marten Rauschenberg, Matthias Jost, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil.
https://www.patreon.com/TwoMinutePapers

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (1 segment)

Segment 1 (00:00 - 03:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. As machine learning research advances over time, learning-based techniques are getting better and better at generating images, or even creating videos when given a topic. A few episodes ago, we talked about DeepMind's Dual Video Discriminator technique, in which multiple neural networks compete against each other, teaching our machines to synthesize a collection of 2-second long videos. One of the key advantages of this method was that it also learned the concept of changes in the camera view, zooming in on an object, and understood that if someone draws something with a pen, the ink has to remain on the paper unchanged.

However, generally, if we wish to ask an AI to synthesize assets for us, in many cases we'll likely have an exact idea of what we are looking for. In these cases, we are looking for a little more artistic control than this technique offers us. So, can we get around this? If so, how? Well, we can! I'll tell you how in a moment, but to understand this solution, we first have to have a firm grasp of the concept of latent spaces.

You can think of a latent space as a compressed representation that tries to capture the essence of the dataset that we have at hand. You can see a similar latent space method in action here that captures the key features that set different kinds of fonts apart and presents these options on a 2D plane, and here, you see our technique that builds a latent space for modeling a wide range of photorealistic material models that we can explore.

And now to this new work. What this tries to do is find a path in the latent space of these images that relates to intuitive concepts like camera zooming, rotation or shifting. That's not an easy task, but if we can pull it off, we'll have more artistic control over these generated images, which would be immensely useful for many creative tasks.
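To make the idea of "finding a path in the latent space" a little more concrete, here is a minimal sketch of the simplest, straight-line version of such a path. Everything in it is an assumption made for illustration and is not taken from the paper's code: the "generator" is a made-up fixed linear map from a 2-D latent code to a 3-pixel image (a real GAN generator is a deep non-linear network), and the "intuitive concept" is a hypothetical `brighten` edit that adds the same amount to every pixel. The sketch fits a latent direction `w` by gradient descent so that moving the latent code by `alpha * w` changes the generated image the same way the image-space edit would.

```python
import random

# Hypothetical toy "generator": a fixed linear map from a 2-D latent code
# to a 3-pixel image. A real GAN generator is a deep non-linear network.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]


def generate(z):
    """Map a 2-D latent vector to a 3-pixel image."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in W]


def brighten(image, alpha):
    """Target edit in image space (an assumption): add alpha to every pixel."""
    return [p + alpha for p in image]


def learn_walk(steps=2000, lr=0.05, seed=0):
    """Fit a latent direction w so that
    generate(z + alpha * w) is close to brighten(generate(z), alpha)."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(steps):
        # Sample a random latent code and a random edit strength.
        z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        alpha = rng.uniform(-1.0, 1.0)
        moved = generate([zi + alpha * wi for zi, wi in zip(z, w)])
        target = brighten(generate(z), alpha)
        residual = [m - t for m, t in zip(moved, target)]
        # Gradient of the squared error with respect to w:
        # 2 * alpha * W^T residual.
        grad = [2.0 * alpha * sum(W[i][j] * residual[i] for i in range(3))
                for j in range(2)]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


if __name__ == "__main__":
    w = learn_walk()
    z = [0.3, -0.7]
    print("learned direction:", w)
    print("walked:  ", generate([z[0] + 0.5 * w[0], z[1] + 0.5 * w[1]]))
    print("targeted:", brighten(generate(z), 0.5))
```

In this toy setting the edit can be reproduced exactly, so the walked image and the directly edited image end up matching. With a real generator and a real edit such as zooming, the match is only approximate, and the path need not be a straight line.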
This new work can perform that, and not only that, but it is also able to learn the concept of color enhancement, and can even increase or decrease the contrast of these images. The key idea of this paper is that this can be done through trying to find crazy, non-linear trajectories in these latent spaces that happen to relate to these intuitive concepts.

It is not perfect, in the sense that we can indeed zoom in on the picture of this dog, but the posture of the dog also changes, and it even seems like we're starting out with a puppy that grows up frame by frame. This means that we have learned to navigate this latent space, but there is still some additional fat in these movements, which is a typical side effect of latent space-based techniques. Also, don't forget that the training data the AI is given has its own limits. However, as you see, we are now one step closer to not only having an AI that synthesizes images for us, but one that does it exactly with the camera setup, rotation, and colors that we are looking for. What a time to be alive!

If you wish to see beautiful formulations of walks… walks in latent spaces, that is, make sure to have a look at the paper in the video description. Also, note that we have now appeared on Instagram with bite-sized pieces of our bite-sized videos. Yes, it is quite peculiar. Make sure to check it out, just search for Two Minute Papers on Instagram or click the link in the video description. Thanks for watching and for your generous support, and I'll see you next time!
