These Neural Networks Have Superpowers! 💪

Two Minute Papers · 16.02.2021 · 7:29 · 149,521 views · 10,209 likes

Video description
❤️ Check out Weights & Biases and sign up for a free demo here: https://www.wandb.com/papers ❤️

Their mentioned post is available here: https://wandb.ai/ayush-thakur/taming-transformer/reports/-Overview-Taming-Transformers-for-High-Resolution-Image-Synthesis---Vmlldzo0NjEyMTY

📝 The paper "Taming Transformers for High-Resolution Image Synthesis" is available here: https://compvis.github.io/taming-transformers/

Tweet links:
Website layout: https://twitter.com/sharifshameem/status/1283322990625607681
Plots: https://twitter.com/aquariusacquah/status/1285415144017797126?s=12
Typesetting math: https://twitter.com/sh_reya/status/1284746918959239168
Population data: https://twitter.com/pavtalk/status/1285410751092416513
Legalese: https://twitter.com/f_j_j_/status/1283848393832333313
Nutrition labels: https://twitter.com/lawderpaul/status/1284972517749338112
User interface design: https://twitter.com/jsngr/status/1284511080715362304

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Serban, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Lau, Eric Martel, Gordon Child, Haris Husic, Jace O'Brien, Javier Bustamante, Joshua Goller, Kenneth Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh.

If you wish to support the series, click here: https://www.patreon.com/TwoMinutePapers

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (5 segments)

Intro

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. I got so excited by the amazing results of this paper. I will try my best to explain why, and by the end of this video, there will be a comparison that blew me away, and I hope you will appreciate it too. With the rise of neural network-based learning algorithms, we are living through the advent of image generation techniques. What you see here is a set of breathtaking results created with a technique called StyleGAN2. This can generate images of humans, cars, cats, and more. As you see, the progress in machine learning-based image generation is just stunning.
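The video does not go into how StyleGAN2 works internally, but the interface of any GAN generator is the same: sample a random latent vector and decode it into an image. Below is a minimal PyTorch sketch of that sampling step; the `Generator` here is a deliberately tiny, hypothetical stand-in, not StyleGAN2's actual architecture.

```python
import torch
import torch.nn as nn

# Hypothetical toy generator: maps a 512-dim latent vector to a small
# RGB image. StyleGAN2's real generator is far more elaborate (mapping
# network, style modulation, progressively upsampled conv blocks).
class Generator(nn.Module):
    def __init__(self, latent_dim=512, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 3 * img_size * img_size),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        x = self.net(z)
        return x.view(-1, 3, self.img_size, self.img_size)

G = Generator()
z = torch.randn(4, 512)   # sample 4 random latent codes
images = G(z)             # decode them into 4 RGB images
print(images.shape)       # torch.Size([4, 3, 64, 64])
```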

OpenAI's GPT-3

And don’t worry for a second about the progress in text processing, because that is similarly amazing these days. A few months ago, OpenAI published their GPT-3 model, which they unleashed to read the internet and learn not just our language, but much, much more. For instance, the internet also contains a lot of computer code, so it learned to generate website layouts from a written description. But that’s not all, not even close: to the joy of technical PhD students around the world, it can properly typeset mathematical equations from a plain English description as well. And get this, it can also translate a complex legal text into plain language, or the other way around. And it does many of these things nearly as well as humans. So what was the key to this work?
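GPT-3 itself is only reachable through OpenAI's API, but the prompt-then-continue workflow described here can be tried locally with its openly released predecessor, GPT-2, via the Hugging Face transformers library. A small sketch, with an illustrative prompt and generation settings:

```python
from transformers import pipeline

# GPT-2: an openly available, much smaller model trained the same way
# as GPT-3 -- predict the next token of internet text.
generator = pipeline("text-generation", model="gpt2")

prompt = "A simple HTML layout for a bakery website:"
completion = generator(prompt, max_new_tokens=40)
print(completion[0]["generated_text"])
```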

Transformer Networks

One of the keys of GPT-3 was that it uses a neural network architecture called the transformer network. These really took the world by storm in the last few years, so our first question is: why transformers? One, transformer networks can typically learn on stupendously large datasets, like the whole internet, and extract a lot of information from them. That is a very good thing. And two, transformers are attention-based neural networks, which means that they are good at learning and generating long sequences of data. Okay, but how do we benefit from this? Well, when we ask OpenAI’s GPT-3 to continue our sentences, it is able to look back at what we have written previously. And it looks at not just a couple of characters, no-no, it looks at up to several pages of writing backwards to make sure that it continues what we write the best way it can. This sounds amazing. But what is the lesson here? Just use transformers for everything and off we go? Well, not quite.
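Attention is the mechanism behind that long look backwards: when producing the next token, the model computes a weighted sum over every earlier position in the sequence, so nothing within the context window is out of reach. A minimal NumPy sketch of scaled dot-product attention, the core building block of transformer networks:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to every key; output is a weighted sum of values.

    Q, K, V: arrays of shape (sequence_length, d), one row per token.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # pairwise token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                              # blend values by weight

# Toy sequence: 5 tokens with 8-dimensional embeddings, attending to itself.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (5, 8)
```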

Image Generation

They are indeed good at a lot of things when it comes to text processing tasks, but they don’t excel at generating high-resolution images at all. Can this be improved somehow? Well, this is what this new technique does, and much, much more. So let’s dive in and see what it can do!

First, we can give it an incomplete image and ask it to finish it. Not bad… but! OpenAI’s Image-GPT could do that too, so what else can it do? Oh boy, a lot more! And by the way, we will compare the results of this technique against Image-GPT at the end of this video. Make sure not to miss that; I almost fell off the chair, and you will see in a moment why.

Two, it can do one of my favorites: depth-to-image generation. We give it a depth map, which is very easy to produce, and it creates a photorealistic image that corresponds to it, which is very hard. We do the easy part, the AI does the hard part. Great! And with this, we not only get a selection of these images, but since we have their depth maps, we can also rotate them around as if they were 3D objects. Nice!

Three, we can also give it a map of labels, which is, again, very easy to do. We just say here goes the sea, put some mountains here, and the sky here, and it will create a beautiful landscape image that corresponds to that. I can’t wait to see what these amazing artists all over the world will be able to get out of these techniques, and these results are already breathtaking… but research is a process, and just imagine how good they will become two more papers down the line. My goodness!

Four, it can also perform super resolution. This is the CSI thing where in goes a blurry image, and out comes a finer, more detailed version of it. Witchcraft.

And finally, five, we can give it a pose, and it generates humans that take these poses.

Now, the important thing here is that it can supercharge transformer networks to do all of these things at the same time, with just one technique.
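How does one technique cover all of these tasks at high resolution? The paper's core idea is a two-stage pipeline: a convolutional VQGAN first compresses the image into a short grid of discrete codebook indices, and a transformer then models sequences of those indices instead of raw pixels, with depth maps, label maps, or poses supplied as extra conditioning tokens. The sketch below is a heavily simplified illustration of stage one; the module and its shapes are hypothetical toys, not the authors' code (which lives at the compvis link above).

```python
import torch
import torch.nn as nn

# Stage 1 (sketch): a VQ encoder turns an image into a small grid of
# discrete codebook indices. The real VQGAN uses a deep convolutional
# autoencoder plus an adversarial loss to keep reconstructions sharp;
# this single conv layer and codebook are stand-ins for illustration.
class ToyVQEncoder(nn.Module):
    def __init__(self, codebook_size=1024, dim=64):
        super().__init__()
        self.conv = nn.Conv2d(3, dim, kernel_size=16, stride=16)  # 256px -> 16x16 grid
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, img):
        z = self.conv(img)                    # (B, dim, 16, 16)
        z = z.flatten(2).transpose(1, 2)      # (B, 256, dim): one vector per cell
        # Quantize: snap each grid cell to its nearest codebook entry.
        dists = torch.cdist(z, self.codebook.weight[None])
        return dists.argmin(dim=-1)           # (B, 256) discrete token indices

encoder = ToyVQEncoder()
img = torch.randn(1, 3, 256, 256)             # a dummy 256x256 RGB image
tokens = encoder(img)
print(tokens.shape)                           # torch.Size([1, 256])

# Stage 2 (not shown): a GPT-style transformer models these 256 indices
# autoregressively -- conditioned on tokens from a depth map, label map,
# or pose -- and a decoder maps sampled indices back to pixels. Working
# with a few hundred tokens per image, instead of tens of thousands of
# pixels, is what makes high resolutions tractable.
```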

Comparison

So how does it compare to OpenAI’s image completion technique? Well, remember, that technique was beyond amazing and set a really high bar. So let’s have a look together! They were both given the upper half of this image and had to fill in the lower half. Remember, as we just learned, transformers are not great at high-resolution image synthesis. So here, for OpenAI’s Image-GPT, we expect heavily pixelated images… and… oh yes, that’s right. So now, hold on to your papers, and let’s see how much more detailed the new technique is. Holy mother of papers! Do you see what I see here? Image-GPT came out just a few months ago, and there is already this kind of progress. So there we go, just imagine what we will be able to do with these supercharged transformers just two more papers down the line. Wow. And that’s where I almost fell off the chair when reading this paper. Hope you held on to yours. It truly feels like we are living in a science fiction world. What a time to be alive!

Thanks for watching and for your generous support, and I'll see you next time!
