DeepMind's AI Learned a Better Understanding of 3D Scenes

3:19

Two Minute Papers · 07.05.2019 · 44,074 views · 1,933 likes

Video description

Backblaze: https://www.backblaze.com/cloud-backup.html#af9tk4
📝 The paper "MONet: Unsupervised Scene Decomposition and Representation" is available here: https://arxiv.org/abs/1901.11390
❤️ Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: 313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Bruno Brito, Bryan Learn, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, James Watt, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Richard Reis, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga.
https://www.patreon.com/TwoMinutePapers
Background image credit: https://pixabay.com/hu/photos/világ%C3%ADtótorony-magyarország-2542726/
Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu
Károly Zsolnai-Fehér's links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (5 segments)

Intro

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This paper was written by scientists at DeepMind, and it is about teaching an AI to look at a 3D scene and decompose it into its individual elements in a meaningful manner. This is typically one of those tasks that is easy for humans and immensely difficult for machines. As this decomposition thing still sounds a little nebulous, let me explain what it means. Here you see an example scene and the segmentation of this scene that the AI came up with, which shows where it thinks the boundaries of the individual objects are. However, we are not stopping there, because it is also able to rip out these objects from the scene one by one. So why is this such a big deal? Well, because of three things.

generative model

One, it is a generative model, meaning that it is able to reorganize these scenes and create new content that actually makes sense. Two, it can prove that it truly has an understanding of 3D scenes by demonstrating that it can deal with occlusions. For instance, if we ask it to rip out the blue cylinder from this scene, it is able to reconstruct parts of it that weren't even visible in the original scene. The same goes for the blue sphere here. Amazing, isn't it? And three, this one is a bombshell: it is an unsupervised learning technique. Now, our more seasoned Fellow Scholars fell out of the chair hearing this, but

training data

just in case, this means that this algorithm is able to learn on its own. We have to feed it a ton of training data, but this training data is not labeled. It just looks at the videos with no additional information, and from watching all this content, it finds out on its own about the concept of these individual objects. The main motivation

motivation

to create such an algorithm was to have an AI look at some gameplay of the StarCraft 2 strategy game and be able to recognize all individual units and the background without any additional supervision. I really hope this also means that DeepMind is working on a version of their StarCraft 2 AI that is able to learn more similarly to how a human does, which is by looking at the pixels of the game. If you look at the details, this will seem almost unfathomably difficult, but it would of course make me unreasonably happy. What a time to be alive! If you check out the paper in the video description, you will find how all this is possible through a creative combination of an attention network and a variational autoencoder.
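To make the "attention network plus variational autoencoder" combination a little less abstract, here is a minimal sketch of how per-object masks can be assembled and used to blend per-object reconstructions back into one image. It is not DeepMind's code. It assumes the recurrent "scope" formulation described in the MONet paper, in which each attention step claims a share of the pixels that are still unexplained. The random numbers stand in for the outputs of the attention network and of the autoencoder, and the names `decompose_masks` and `composite` are hypothetical.

```python
import random


def decompose_masks(alphas):
    """Turn K-1 per-pixel attention values into K masks that sum to 1 per pixel.

    alphas: list of K-1 lists, each holding one value in [0, 1] per pixel
    (a stand-in for what an attention network would output at each step).
    """
    num_pixels = len(alphas[0])
    scope = [1.0] * num_pixels  # at the start, every pixel is still unexplained
    masks = []
    for alpha in alphas:
        # This slot claims a fraction alpha of whatever is still unexplained.
        masks.append([s * a for s, a in zip(scope, alpha)])
        # The rest stays in the scope for the following slots.
        scope = [s * (1.0 - a) for s, a in zip(scope, alpha)]
    masks.append(scope)  # the last slot takes everything that is left over
    return masks


def composite(masks, components):
    """Blend per-slot reconstructions into one image, weighted by the masks."""
    num_pixels = len(masks[0])
    return [
        sum(m[i] * c[i] for m, c in zip(masks, components))
        for i in range(num_pixels)
    ]


random.seed(0)
NUM_PIXELS, NUM_SLOTS = 6, 4  # toy sizes, chosen only for illustration

# Stand-ins for the attention outputs and the autoencoder's reconstructions.
alphas = [[random.random() for _ in range(NUM_PIXELS)] for _ in range(NUM_SLOTS - 1)]
components = [[random.random() for _ in range(NUM_PIXELS)] for _ in range(NUM_SLOTS)]

masks = decompose_masks(alphas)
image = composite(masks, components)
```

Because the masks sum to one at every pixel, each pixel of the blended image is a weighted average of the slot reconstructions. A slot can still hold a plausible guess for pixels where its mask is close to zero, which gives some intuition for how hidden parts of an occluded object can be filled in.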

backblaze

This episode has been supported by Backblaze. Backblaze is an unlimited online backup solution for only $6 a month, and I have been using it for years to make sure my personal data, family pictures, and the materials required to create this series are safe. You can try it free of charge for 15 days, and if you don't like it, you can immediately cancel it without losing anything. Make sure to sign up for Backblaze today through the link in the video description, and this way you not only keep your personal data safe, but you also help support this series. Thanks for watching and for your generous support, and I'll see you next time!
