# AI Learns 3D Face Reconstruction | Two Minute Papers #198

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=9BOdng9MpzU
- **Date:** 19.10.2017
- **Duration:** 3:15
- **Views:** 32,172
- **Source:** https://ekstraktznaniy.ru/video/14572

## Description

The paper "Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression" is available here:
http://aaronsplace.co.uk/papers/jackson2017recon/

Online demo:
http://cvl-demos.cs.nott.ac.uk/vrn/

Source code:
https://github.com/AaronJackson/vrn

We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Andrew Melnychuk, Brian Gilman, Dave Rushton-Smith, Dennis Abts, Eric Haddad, Esa Turkulainen, Evan Breznyik, Kaben Gabriel Nanlohy, Malek Cellier, Michael Albrecht, Michael Jensen, Michael Orenstein, Steef, Sunil Kim, Torsten Reil.
https://www.patreon.com/TwoMinutePapers

Two Minute Papers Merch:
US: http://twominutepapers.com/
EU/Worldwide: https://shop.spreadshirt.net/TwoMinutePapers/

Music: Antarctica by Audionautix is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/)
Artist: http://audionautix.com/ 

Thumbnail background image credit: https://pixabay.com/photo-1836445/

## Transcript

### Segment 1 (00:00 - 03:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. Now that facial recognition is becoming more and more of a hot topic, let's talk a bit about 3D face reconstruction! This is a problem where we have a 2D input photograph, or a video of a person, and the goal is to create a piece of 3D geometry from it. To accomplish this, previous works often required a combination of proper alignment of the face, multiple photographs and dense correspondences, which is a fancy name for additional data that identifies the same regions across these photographs. But this new formulation is the holy grail of all possible versions of this problem, because it requires nothing else but one 2D photograph.

The weapon of choice for this work was a Convolutional Neural Network, and the dataset the algorithm was trained on couldn't be simpler: it was given a large database of 2D input image and 3D output geometry pairs. This means that the neural network can look at a lot of these pairs and learn how these input photographs are mapped to 3D geometry. And as you can see, the results are absolutely insane, especially given the fact that it works for arbitrary face positions and many different expressions, and even with occlusions.

However, this is not your classical Convolutional Neural Network, because as we mentioned, the input is 2D and the output is 3D. So the question immediately arises: what kind of data structure should be used for the output? The authors went for a 3D voxel array, which is essentially a cube in which we build up the face from small, identical Lego pieces. This representation is similar to the terrain in the game Minecraft, only the resolution of these blocks is finer. The process of guessing how these voxel arrays should look based on the input photograph is referred to in the research community as volumetric regression. This is what this work is about.

And now comes the best part! An online demo is also available where we can either try some prepared images or upload our own. So while I run my own experiments, don't leave me out of the good stuff and make sure you post your results in the comments section! The source code is also available for you fellow tinkerers out there.

The limitations of this technique include the inability to detect expressions that are very far away from the ones seen in the training set, and as you can see in the videos, temporal coherence could also use some help. This means that if we have video input, the reconstruction has some tiny differences in each frame. Maybe a Recurrent Neural Network, like some variant of Long Short-Term Memory, could address this in the near future. However, those are trickier and more resource-intensive to train properly.

Very excited to see how these solutions evolve, and of course, Two Minute Papers is going to be here for you to talk about some amazing upcoming works. Thanks for watching and for your generous support, and I'll see you next time!
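
To make the voxel idea from the transcript a little more concrete, here is a minimal sketch in plain Python. It is not the authors' code: a synthetic sphere stands in for a face, hand-made per-voxel scores stand in for the network's output, and the names `make_scores`, `to_occupancy` and `iou`, the 0.5 threshold and the 16-voxel grid are all assumptions chosen for illustration. It shows how per-voxel scores might be turned into an occupancy grid, and how the frame-to-frame differences mentioned as a limitation could be quantified with a simple overlap measure.

```python
# Hypothetical illustration only: a sphere stands in for a face, and
# hand-made scores stand in for a network's per-voxel predictions.

def make_scores(size, center, radius):
    """Return a size x size x size nested list of scores in [0, 1].

    Voxels inside the sphere get 0.9, voxels outside get 0.1.
    """
    cx, cy, cz = center
    r2 = radius ** 2
    return [
        [
            [
                0.9 if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r2 else 0.1
                for z in range(size)
            ]
            for y in range(size)
        ]
        for x in range(size)
    ]


def to_occupancy(scores, threshold=0.5):
    """Threshold per-voxel scores into a set of occupied (x, y, z) cells."""
    return {
        (x, y, z)
        for x, plane in enumerate(scores)
        for y, row in enumerate(plane)
        for z, value in enumerate(row)
        if value >= threshold
    }


def iou(a, b):
    """Intersection over union of two occupied-voxel sets (1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Two "video frames": the same shape, shifted by one voxel in the second.
frame_1 = to_occupancy(make_scores(16, (8, 8, 8), 5))
frame_2 = to_occupancy(make_scores(16, (9, 8, 8), 5))

print("occupied voxels in frame 1:", len(frame_1))
print("overlap of a frame with itself:", iou(frame_1, frame_1))
print("overlap between consecutive frames:", iou(frame_1, frame_2))
```

An overlap score below 1.0 between consecutive frames is one simple way to express the "tiny differences in each frame" described above. A finer grid gives a more detailed shape at the cost of many more cells, since the cell count grows with the cube of the resolution.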
