This AI Learned to Summarize Videos 🎥
5:23

Two Minute Papers 18.04.2020 94,596 views 4,762 likes

Video description
❤️ Check out Linode here and get $20 free credit on your account: https://www.linode.com/papers 📝 The paper "CLEVRER: CoLlision Events for Video REpresentation and Reasoning" is available here: http://clevrer.csail.mit.edu/ ❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: - https://www.patreon.com/TwoMinutePapers - https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join 🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Benji Rabhan, Brian Gilman, Bryan Learn, Christian Ahlin, Daniel Hasegan, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, James Watt, Javier Bustamante, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh. https://www.patreon.com/TwoMinutePapers Thumbnail background image credit: https://pixabay.com/images/id-95032/ Neural network image credit: https://en.wikipedia.org/wiki/Neural_network Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/invite/hbcTJu2 Károly Zsolnai-Fehér's links: Instagram: https://www.instagram.com/twominutepapers/ Twitter: https://twitter.com/karoly_zsolnai Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (2 segments)

Segment 1 (00:00 - 05:00)

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Neural network-based learning algorithms are making great leaps in a variety of areas. And many of us are wondering whether it is possible that one day we’ll get a learning algorithm, show it a video, and ask it to summarize it, so that we can then decide whether we wish to watch it or not. Or just describe what we are looking for and it would fetch the appropriate videos for us. I think today’s paper is a good pointer as to whether we can expect this to happen, and in a few moments, we’ll find out together why.

A few years ago, these neural networks were mainly used for image classification, or in other words, they would tell us what kinds of objects are present in an image. But they are capable of so much more. For instance, these days, we can get a recurrent neural network to write proper sentences about images, and it would work well for even highly non-trivial cases. For instance, it is able to infer that work is being done here, or that a ball is present in this image even if the vast majority of the ball itself is concealed. The even crazier thing about this is that this work is not recent at all, this is from a paper that is more than 4 years old! Insanity. The first author of this paper was Andrej Karpathy, one of the best minds in the game, who is currently the director of AI at Tesla and works on making these cars able to drive themselves.

So, as amazing as this work was, the progress in machine learning research keeps on accelerating, so let’s have a look at this newer paper that takes it a step further, and has a look at not an image, but a video, and explains what happens therein. Very exciting. Let’s have a look at an example! This was the input video, and let’s stop right at the first statement. The red sphere enters the scene. So, it was able to correctly identify not only what we are talking about in terms of color and shape, but also knows what this object is doing as well. That’s a great start. Let’s proceed further. Now, it correctly identifies the collision event with the cylinder, then this cylinder hits another cylinder, very good… and look at that. It identifies that the cylinder is made of metal. I like that a lot, because this particular object is made of a very reflective material, which shows us more about the surrounding room than the object itself.

But we shouldn’t only let the AI tell us what is going on on its own terms - let’s ask questions and see if it can answer them correctly. So, first, let’s ask: what is the material of the last object that hit the cyan cylinder? And it correctly finds that the answer is Metal. Awesome. Now let’s take it a step further and stop the video here - can it predict what is about to happen after this point? Look, it indeed can!

This is remarkable because of two things. One, if we look under the hood, we see that to be able to pull this off, it not only has to understand what objects are present in the video and predict how they will interact, but also has to parse our questions correctly, put it all together, and form an answer based on all this information. If any of these tasks works unreliably, the answers will be incorrect. And two, there are many other techniques that are able to do some of these tasks, so why is this one particularly interesting? Well, look here! This new method is able to do all of these tasks at the same time. So there we go, if this improves further, we might become able to search YouTube videos by just typing something that happens in the video and it would be able to automatically find it for us. That would be absolutely amazing. What a time to be alive!

This episode has been supported by Linode. Linode is the world’s largest independent cloud computing provider. Unlike entry-level hosting services, Linode gives you full backend access to your server, which is your step up to powerful, fast, fully configurable cloud computing. Linode also has One-Click Apps that streamline your ability to deploy websites, personal VPNs, game servers, and more. If you need something as small as a personal online portfolio, Linode has your back, and if you need to manage tons of clients’ websites and reliably serve them to millions of visitors, Linode can do that too. What’s more, they offer affordable GPU instances featuring the Quadro RTX 6000 which is tailor-made for AI, scientific computing and computer graphics projects. If only I had access to a tool like

Segment 2 (05:00 - 05:23)

this while I was working on my last few papers! To receive $20 in credit on your new Linode account, visit linode.com/papers or click the link in the description and give it a try today! Our thanks to Linode for supporting the series and helping us make better videos for you. Thanks for watching and for your generous support, and I'll see you next time!
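
Editor's sketch: the question from the video, "what is the material of the last object that hit the cyan cylinder?", can be illustrated with a toy example. This is not the paper's method. It assumes the video has already been turned into a list of collision events, and every name here (`Obj`, `Collision`, `last_object_that_hit`) and every object in the scene is hypothetical, made up only to show how such an answer could be formed from detected objects and events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    # Hypothetical symbolic description of one object in the video.
    color: str
    shape: str
    material: str


@dataclass(frozen=True)
class Collision:
    # Hypothetical collision event between two objects at a given frame.
    frame: int
    a: Obj
    b: Obj


def last_object_that_hit(target_color, target_shape, events):
    """Return the most recent object that collided with the target, or None."""
    hits = []
    for ev in sorted(events, key=lambda e: e.frame):
        # A collision is symmetric, so check both orderings.
        for target, other in ((ev.a, ev.b), (ev.b, ev.a)):
            if target.color == target_color and target.shape == target_shape:
                hits.append(other)
    return hits[-1] if hits else None


# Made-up scene, not taken from the dataset.
red_sphere = Obj("red", "sphere", "rubber")
cyan_cylinder = Obj("cyan", "cylinder", "rubber")
gray_cylinder = Obj("gray", "cylinder", "metal")

events = [
    Collision(frame=40, a=red_sphere, b=cyan_cylinder),
    Collision(frame=75, a=gray_cylinder, b=cyan_cylinder),
]

answer = last_object_that_hit("cyan", "cylinder", events)
print(answer.material)
```

The hard parts described in the video (detecting the objects, predicting their motion, and parsing the question) are all assumed away here. The sketch only covers the final step of combining that information into an answer.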
