# DeepMind's AI Learns To See | Two Minute Papers #263

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=gnctSz2ofU4
- **Date:** 12.07.2018
- **Duration:** 2:43
- **Views:** 37,769
- **Source:** https://ekstraktznaniy.ru/video/14443

## Description

Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers

Crypto and PayPal links are available below. Thank you very much for your generous support!
Bitcoin: 13hhmJnLEzwXgmgJN7RB6bWVdT7WkrFAHh
PayPal: https://www.paypal.me/TwoMinutePapers
Ethereum: 0x002BB163DfE89B7aD0712846F1a1E53ba6136b5A
LTC: LM8AUh5bGcNgzq6HaV1jeaJrFvmKxxgiXg

The papers "Neural scene representation and rendering" and "Gaussian Material Synthesis" are available here:
1. https://deepmind.com/documents/211/Neural_Scene_Representation_and_Rendering_preprint.pdf
2. https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/

We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Andrew Melnychuk, Angelos Evripiotis, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dennis Abts, Emmanuel, Eric Haddad, Esa Turkulainen, Geronimo Moralez, Kjartan Olason, Lorin Atzberger, Marten Rauschenberg, Michael Albrecht, Michael Jensen, Morten Punnerud En

## Transcript

### Segment 1 (00:00 - 02:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This is a recent DeepMind paper on neural rendering where they taught a learning-based technique to see things the way humans do. What's more, it has an understanding of geometry, viewpoints, shadows, occlusion, even self-shadowing and self-occlusion, and many other difficult concepts.

So what does this do and how does it work exactly? It contains a representation and a generation network. The representation network takes a bunch of observations, a few screenshots if you will, and encodes this visual sensory data into a concise description that contains the underlying information in the scene. These observations are made from only a handful of camera positions and viewpoints. The neural rendering or seeing part means that we choose a position and viewpoint that the algorithm hasn't seen yet, and ask the generation network to create an appropriate image that matches reality.

Now, we have to hold on to our papers for a moment and understand why this is such a crazy idea. Computer graphics researchers work so hard on creating similar rendering and light simulation programs that take tons of computational power to compute all aspects of light transport and in return, give us a beautiful image. If we slightly change the camera angles, we have to redo most of the same computations, whereas a learning-based algorithm may just say "don't worry, I got this", and from previous experience, guesses the remainder of the information perfectly. I love it.

And what's more, by leaning on what these two networks learned, it generalizes so well that it can even deal with previously unobserved scenes. If you remember, I have also worked on a neural renderer for about 3000 hours and created an AI that predicts photorealistic images perfectly. The difference was that this one took a fixed camera viewpoint, and predicted what the object would look like if we started changing its material properties.
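The representation-plus-generation pipeline described above can be sketched in code. This is a minimal, dependency-free illustration of the data flow only, not the paper's actual convolutional and recurrent architecture: the networks here are stand-in random linear maps, and the dimensions (`REP_DIM`, `IMG_DIM`, `POSE_DIM`) are hypothetical placeholders.

```python
# Hypothetical sketch of a GQN-style pipeline: a representation network
# encodes (image, camera pose) observations into one scene vector, and a
# generation network predicts an image for a new, unseen camera pose.
# The "networks" are untrained random linear maps, for illustration only.
import random

random.seed(0)

REP_DIM = 8    # size of the scene representation vector (assumption)
IMG_DIM = 16   # flattened "image" size (assumption)
POSE_DIM = 5   # camera position + viewing direction parameters (assumption)

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

class RepresentationNetwork:
    """Encodes each observation and sums the codes into one scene vector."""
    def __init__(self):
        self.w = rand_matrix(REP_DIM, IMG_DIM + POSE_DIM)

    def encode(self, observations):
        rep = [0.0] * REP_DIM
        for image, pose in observations:
            code = matvec(self.w, image + pose)
            # Summing makes the representation order-invariant across
            # observations, so any handful of viewpoints can be combined.
            rep = [r + c for r, c in zip(rep, code)]
        return rep

class GenerationNetwork:
    """Predicts an image for a query camera pose from the scene vector."""
    def __init__(self):
        self.w = rand_matrix(IMG_DIM, REP_DIM + POSE_DIM)

    def render(self, rep, query_pose):
        return matvec(self.w, rep + query_pose)

# A few observations of one scene from different camera poses.
observations = [
    ([random.random() for _ in range(IMG_DIM)],
     [random.random() for _ in range(POSE_DIM)])
    for _ in range(3)
]

rep_net = RepresentationNetwork()
gen_net = GenerationNetwork()

scene_rep = rep_net.encode(observations)

# Query a camera pose the networks never observed.
novel_pose = [random.random() for _ in range(POSE_DIM)]
predicted_image = gen_net.render(scene_rep, novel_pose)
```

In the real system both networks are trained end to end so that the predicted image matches the ground truth at the query viewpoint; this sketch only shows why re-rendering from a new camera angle is a single cheap forward pass rather than a full light-transport simulation.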
I'd love to see a possible combination of these two works, oh my! Super excited for this. There is a link in the video description to both of these works. Can you think of other possible uses for these techniques? Let me know in the comments section! And, if you wish to decide the order of future episodes or get your name listed as a key supporter for the series, hop over to our Patreon page and pick up some cool perks. We use these funds to improve the series and empower other research projects and conferences. As this video series is on the cutting edge of technology, of course, we also support cryptocurrencies like Bitcoin, Ethereum, and Litecoin. The addresses are available in the video description. Thanks for watching and for your generous support, and I'll see you next time!
