# AI Builds 3D Models From Images With a Twist | Two Minute Papers #129

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=kf-KViOuktc
- **Date:** 19.02.2017
- **Duration:** 3:43
- **Views:** 21,890

## Description

The paper "IM2CAD" is available here:
http://homes.cs.washington.edu/~izadinia/im2cad.html

LSUN Challenge datasets: http://lsun.cs.princeton.edu/2016/

More related papers are available here:
http://www.cs.toronto.edu/~fidler/projects/rent3D.html
http://web.engr.illinois.edu/~slazebni/publications/iccv15_informative.pdf
http://ieeexplore.ieee.org/document/6619238/?reload=true

WE WOULD LIKE TO THANK OUR GENEROUS PATREON SUPPORTERS WHO MAKE TWO MINUTE PAPERS POSSIBLE:
Sunil Kim, Daniel John Benton, Dave Rushton-Smith.
https://www.patreon.com/TwoMinutePapers

Subscribe if you would like to see more of these! - http://www.youtube.com/subscription_center?add_user=keeroyz

Music: Dat Groove by Audionautix is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/)
Artist: http://audionautix.com/

Thumbnail background image credit: https://pixabay.com/photo-389254/
Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook → https://www.facebook.com/TwoMinutePapers/
Twitter → https://twitter.com/karoly_zsolnai
Web → https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=kf-KViOuktc) Segment 1 (00:00 - 03:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. CAD stands for computer-aided design, basically a digital 3D model of a scene. Here is an incredibly difficult problem: what if we give the computer a photograph of a living room, and the output would be a digital, fully modeled 3D scene? A CAD model. This is a remarkably difficult problem: just think about it! The algorithm would have to have an understanding of perspective, illumination, occlusions, and geometry. And with that in mind, have a look at these incredible results.

Now, this clearly sounds impossible. Not so long ago we talked about a neural network-based technique that tried to achieve something similar and it was remarkable, but the output was a low-resolution voxel array, which is kind of like an approximate model built by children from a few large LEGO pieces. The link to this work is available in the video description.

But clearly, we can do better, so how could this be possible? Well, the most important observation is that if we take a photograph of a room, there is a high chance that the furniture within is not custom built, but consists mostly of commercially available pieces. So who said that we have to build these models from scratch? Let's look into a database that contains the geometry for publicly available furniture pieces and find which ones are seen in the image!

So here's what we do: given a large amount of training samples, neural networks are adept at recognizing objects in a photograph. That would be step number one. After the identification, the algorithm knows where the object is; now we're interested in what it looks like and how it is aligned. And then, we start to look up public furniture databases for objects that are as similar to the ones presented in the photo as possible. Finally, we put everything in its appropriate place, and create a new digital image with a light simulation program.
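The database lookup step can be illustrated with a small sketch. The video does not say how the similarity between a detected object and a catalogue model is measured, so everything here is an assumption made for illustration: the three-number descriptors, the catalogue entries in `furniture_db`, and the use of cosine similarity as the matching score.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two descriptor vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_closest_model(query, database):
    """Return the name of the catalogue entry whose descriptor is most similar to the query."""
    return max(database, key=lambda name: cosine_similarity(query, database[name]))


# Hypothetical catalogue: model name -> made-up appearance descriptor.
furniture_db = {
    "armchair_01": [0.9, 0.1, 0.3],
    "sofa_02": [0.2, 0.8, 0.5],
    "coffee_table_03": [0.1, 0.2, 0.9],
}

# Hypothetical descriptor of an object detected in the photograph.
detected_object = [0.25, 0.75, 0.45]

best_match = retrieve_closest_model(detected_object, furniture_db)
print(best_match)
```

In a real system the descriptors would presumably come from a trained network and the catalogue would hold thousands of 3D models, but the lookup would still amount to choosing the entry with the best similarity score.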
This is an iterative algorithm, which means that it starts out with a coarse initial guess that is refined many times until some sort of convergence is reached. Convergence means that no matter how hard we try, only minor improvements can be made to this solution. Then, we can stop. And here, the dissimilarity between the photograph and the digitally rendered image was subject to minimization. This entire process of creating the 3D geometry of the scene takes around 5 minutes. And this technique can also estimate the layout of a room from this one photograph.

Now, this algorithm is absolutely amazing, but of course, the limitations are also to be candidly discussed. While some failure cases arise from misjudging the alignment of the objects, the technique is generally quite robust. Non-cubic room shapes are also likely to introduce issues such as the omission or misplacement of an object. Also, kitchens and bathrooms are not yet supported. Note that this is not the only paper solving this problem; I've made sure to link some more related papers in the video description for your enjoyment.

If you have found this interesting, make sure to subscribe and stay tuned for more Two Minute Papers episodes! Thanks for watching and for your generous support, and I'll see you next time!
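The iterative refinement described in this segment (a coarse guess, repeated improvement, stopping once only minor gains remain) can be sketched with a toy example. This is not the optimizer used in the paper, which the video does not describe. The `dissimilarity` function stands in for "render the scene and compare it to the photograph", the three-number `target_pose` is made up, and the search strategy (coordinate steps with a shrinking step size) is an assumption.

```python
def dissimilarity(params, target):
    """Toy stand-in for comparing a rendered image to the photo: squared distance."""
    return sum((p - t) ** 2 for p, t in zip(params, target))


def refine(initial, target, step=1.0, tolerance=1e-6, max_iters=1000):
    """Refine a coarse guess until only minor improvements remain (convergence)."""
    params = list(initial)
    best = dissimilarity(params, target)
    for _ in range(max_iters):
        improved = False
        # Try nudging each parameter up and down by the current step size.
        for i in range(len(params)):
            for delta in (step, -step):
                candidate = list(params)
                candidate[i] += delta
                cost = dissimilarity(candidate, target)
                # Accept only changes that improve by more than the tolerance.
                if cost < best - tolerance:
                    params, best = candidate, cost
                    improved = True
        if not improved:
            # No useful move at this scale: search more finely, or stop.
            step /= 2
            if step < tolerance:
                break
    return params, best


# Hypothetical "true" object placement and a coarse initial guess.
target_pose = [2.0, -1.5, 0.5]
initial_guess = [0.0, 0.0, 0.0]

refined_pose, final_cost = refine(initial_guess, target_pose)
print(refined_pose, final_cost)
```

The stopping rule mirrors the definition of convergence given in the video: once no candidate change lowers the dissimilarity by more than a small tolerance, the loop ends.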

---
*Source: https://ekstraktznaniy.ru/video/14708*