# This AI Learns About Movement By Watching Frozen People

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=prMk6Znm4Bc
- **Date:** 31.07.2019
- **Duration:** 2:38
- **Views:** 64,755

## Description

📝 The paper "Learning the Depths of Moving People by Watching Frozen People" is available here:
https://mannequin-depth.github.io/

❤️ Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Bruno Brito, Bryan Learn, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Daniel Hasegan, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, Ivelin Ivanov, James Watt, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Zach Boldyga.

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=prMk6Znm4Bc) Introduction

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This paper is about endowing color images with depth information, which is typically done through depth maps. Depth maps describe how far each part of the scene is from the camera and are given with a color coding where the darker the color, the farther away the object. These depth maps can be used to apply a variety of effects that require knowledge of the distance of the objects within the image - for instance, selectively defocusing parts of the image, or even removing people and inserting new objects into the scene. If we humans look at an image, we have an intuitive understanding of its contents and the knowledge to produce a depth map with pen and paper. However, this would, of course, be infeasible and would take too long, so we would prefer

### [0:49](https://www.youtube.com/watch?v=prMk6Znm4Bc&t=49s) Results on the Mannequin Challenge Test Set

a machine to do it for us instead. But of course, machines don't understand the concept of 3D geometry, so they probably cannot help us with this. Or, with the power of machine learning algorithms, can they? This new paper from scientists at Google Research attempts to do just that, but with a twist.
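The depth-map coding described earlier (nearer pixels brighter, farther pixels darker) and a depth-aware effect like selective defocus can be sketched in a few lines. This is an illustrative toy, not the paper's code, and all function names and parameters are made up:

```python
import numpy as np

def depth_to_gray(depth):
    """Visualize a metric depth map with the coding from the narration:
    nearer pixels are brighter, farther pixels darker."""
    d = np.asarray(depth, dtype=float)
    near, far = d.min(), d.max()
    if far == near:                      # flat scene: nothing to contrast
        return np.zeros(d.shape, dtype=np.uint8)
    return np.round(255.0 * (far - d) / (far - near)).astype(np.uint8)

def selective_defocus(image, depth, focus_depth, tol=0.5):
    """Keep pixels near the chosen focal plane sharp and replace the rest
    with a crude 3x3 box blur -- the kind of effect mentioned above."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    pad = np.pad(img, 1, mode="edge")
    # box blur as the mean of the nine shifted copies of the image
    blur = sum(pad[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    out = img.copy()
    out_of_focus = np.abs(np.asarray(depth, dtype=float) - focus_depth) > tol
    out[out_of_focus] = blur[out_of_focus]
    return out
```

For example, `depth_to_gray([[1.0, 4.0]])` maps the near pixel (1 m) to 255 and the far pixel (4 m) to 0.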

### [1:07](https://www.youtube.com/watch?v=prMk6Znm4Bc&t=67s) Approach: Learn the depths of moving people by watching frozen people

The twist is that a learning algorithm is unleashed on a dataset of what they call mannequins - in other words, real humans asked to stand frozen in a variety of positions while the camera moves around the scene. The goal is for the algorithm to look at these frozen people and take into consideration the parallax of the camera movement: objects closer to the camera appear to move more than objects farther away. And it turns out this kind of knowledge can be exploited, so much so that if we train our AI properly, it will be able to predict the depth maps of people who are moving around, even though it has only ever seen frozen mannequins before. This is particularly difficult because with an animation, we have to make sure the guesses are consistent across time, or else we get the annoying flickering effects you see here with previous techniques. Some flickering remains with the new method, especially in the background, but the improvement on the human part of the image is truly remarkable. Beyond the removal and insertion techniques we talked about earlier, I am also really

### [2:21](https://www.youtube.com/watch?v=prMk6Znm4Bc&t=141s) Removing Humans for View Synthesis

excited about this method, as it may open up the possibility of creating video versions of the amazing portrait-mode images taken with many of the newer smartphones people have in their pockets. Thanks for watching and for your generous support, and I'll see you next time!
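The parallax cue the video leans on - nearer objects shift more across frames - has a simple closed form for a horizontally translating pinhole camera: depth is inversely proportional to the pixel shift (disparity), Z = f·b/d. A minimal sketch under that assumption (variable names are illustrative, not from the paper):

```python
def depth_from_parallax(focal_px, baseline_m, disparity_px):
    """Depth of a point from its parallax between two camera positions:
    Z = f * b / d, so a larger pixel shift means a nearer object."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means a point at infinity)")
    return focal_px * baseline_m / disparity_px

# With a 1000 px focal length and frames taken 0.1 m apart, a point that
# shifts 50 px sits 2 m away; half the shift means twice the depth.
```

For example, `depth_from_parallax(1000.0, 0.1, 50.0)` gives 2.0 m, while halving the disparity to 25 px gives 4.0 m, which is exactly the "closer objects move more" relationship the training exploits.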

---
*Source: https://ekstraktznaniy.ru/video/14275*