These Neural Networks Have Superpowers! 💪

Two Minute Papers · 16.02.2021 · 7:29 · 149,521 views · 10,209 likes

Video description
❤️ Check out Weights & Biases and sign up for a free demo here: https://www.wandb.com/papers ❤️

Their mentioned post is available here: https://wandb.ai/ayush-thakur/taming-transformer/reports/-Overview-Taming-Transformers-for-High-Resolution-Image-Synthesis---Vmlldzo0NjEyMTY

📝 The paper "Taming Transformers for High-Resolution Image Synthesis" is available here: https://compvis.github.io/taming-transformers/

Tweet links:
Website layout: https://twitter.com/sharifshameem/status/1283322990625607681
Plots: https://twitter.com/aquariusacquah/status/1285415144017797126?s=12
Typesetting math: https://twitter.com/sh_reya/status/1284746918959239168
Population data: https://twitter.com/pavtalk/status/1285410751092416513
Legalese: https://twitter.com/f_j_j_/status/1283848393832333313
Nutrition labels: https://twitter.com/lawderpaul/status/1284972517749338112
User interface design: https://twitter.com/jsngr/status/1284511080715362304

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Serban, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Lau, Eric Martel, Gordon Child, Haris Husic, Jace O'Brien, Javier Bustamante, Joshua Goller, Kenneth Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh.

If you wish to support the series, click here: https://www.patreon.com/TwoMinutePapers

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (5 segments)

Intro

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. I got so excited by the amazing results of this paper. I will try my best to explain why, and by the end of this video, there will be a comparison that blew me away, and I hope you will appreciate it too. With the rise of neural network-based learning algorithms, we are living through the advent of image generation techniques. What you see here is a set of breathtaking results created with a technique called StyleGAN2. This can generate images of humans, cars, cats, and more. As you see, the progress in machine learning-based image generation is just stunning.
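The video does not go into how StyleGAN2 works internally, but the interface of any GAN generator is the same: sample a random latent vector and decode it into an image. Below is a minimal PyTorch sketch of that sampling step; the `Generator` here is a deliberately tiny, hypothetical stand-in, not StyleGAN2's actual architecture.

```python
import torch
import torch.nn as nn

# Hypothetical toy generator: maps a 512-dim latent vector to a small
# RGB image. StyleGAN2's real generator is far more elaborate (mapping
# network, style modulation, progressively upsampled conv blocks).
class Generator(nn.Module):
    def __init__(self, latent_dim=512, img_size=64):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 3 * img_size * img_size),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        x = self.net(z)
        return x.view(-1, 3, self.img_size, self.img_size)

G = Generator()
z = torch.randn(4, 512)   # sample 4 random latent codes
images = G(z)             # decode them into 4 RGB images
print(images.shape)       # torch.Size([4, 3, 64, 64])
```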

OpenAI's GPT-3

And don’t worry for a second about the progress in text processing, because that is similarly amazing these days. A few months ago, OpenAI published their GPT-3 model, which they unleashed to read the internet and learn not just our language, but much, much more. For instance, the internet also contains a lot of computer code, so it learned to generate website layouts from a written description. But that’s not all, not even close: to the joy of technical PhD students around the world, it can properly typeset mathematical equations from a plain English description as well. And get this, it can also translate a complex legal text into plain language, or the other way around. And it does many of these things nearly as well as humans. So what was the key to this work?
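GPT-3 itself is only reachable through OpenAI's API, but the prompt-then-continue workflow described here can be tried locally with its openly released predecessor, GPT-2, via the Hugging Face transformers library. A small sketch, with an illustrative prompt and generation settings:

```python
from transformers import pipeline

# GPT-2: an openly available, much smaller model trained the same way
# as GPT-3 -- predict the next token of internet text.
generator = pipeline("text-generation", model="gpt2")

prompt = "A simple HTML layout for a bakery website:"
completion = generator(prompt, max_new_tokens=40)
print(completion[0]["generated_text"])
```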

Transformer Networks

One of the keys of GPT-3 was that it uses a neural network architecture called the transformer network. These really took the world by storm in the last few years, so our first question is: why transformers? One, transformer networks can typically learn on stupendously large datasets, like the whole internet, and extract a lot of information from them. That is a very good thing. And two, transformers are attention-based neural networks, which means that they are good at learning and generating long sequences of data. Okay, but how do we benefit from this? Well, when we ask OpenAI’s GPT-3 to continue our sentences, it is able to look back at what we have written previously. And it looks at not just a couple of characters, no-no, it looks at up to several pages of writing backwards to make sure that it continues what we write the best way it can. This sounds amazing. But what is the lesson here? Just use transformers for everything and off we go? Well, not quite.
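Attention is the mechanism behind that long look backwards: when producing the next token, the model computes a weighted sum over every earlier position in the sequence, so nothing within the context window is out of reach. A minimal NumPy sketch of scaled dot-product attention, the core building block of transformer networks:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to every key; output is a weighted sum of values.

    Q, K, V: arrays of shape (sequence_length, d), one row per token.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # pairwise token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                              # blend values by weight

# Toy sequence: 5 tokens with 8-dimensional embeddings, attending to itself.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (5, 8)
```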

Image Generation

They are indeed good at a lot of things when it comes to text processing tasks, but they don’t excel at generating high-resolution images at all. Can this be improved somehow? Well, this is what this new technique does, and much, much more. So let’s dive in and see what it can do!

First, we can give it an incomplete image and ask it to finish it. Not bad… but! OpenAI’s Image-GPT could do that too, so what else can it do? Oh boy, a lot more! And by the way, we will compare the results of this technique against Image-GPT at the end of this video. Make sure not to miss that; I almost fell off the chair, and you will see in a moment why.

Two, it can do one of my favorites: depth-to-image generation. We give it a depth map, which is very easy to produce, and it creates a photorealistic image that corresponds to it, which is very hard. We do the easy part, the AI does the hard part. Great! And with this, we not only get a selection of these images, but since we have their depth maps, we can also rotate them around as if they were 3D objects. Nice!

Three, we can also give it a map of labels, which is, again, very easy to do. We just say here goes the sea, put some mountains here, and the sky here, and it will create a beautiful landscape image that corresponds to that. I can’t wait to see what these amazing artists all over the world will be able to get out of these techniques, and these results are already breathtaking… but research is a process, and just imagine how good they will become two more papers down the line. My goodness!

Four, it can also perform super resolution. This is the CSI thing where in goes a blurry image, and out comes a finer, more detailed version of it. Witchcraft.

And finally, five, we can give it a pose, and it generates humans that take these poses.

Now, the important thing here is that it can supercharge transformer networks to do all of these things at the same time, with just one technique.
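How does one technique cover all of these tasks at high resolution? The paper's core idea is a two-stage pipeline: a convolutional VQGAN first compresses the image into a short grid of discrete codebook indices, and a transformer then models sequences of those indices instead of raw pixels, with depth maps, label maps, or poses supplied as extra conditioning tokens. The sketch below is a heavily simplified illustration of stage one; the module and its shapes are hypothetical toys, not the authors' code (which lives at the compvis link above).

```python
import torch
import torch.nn as nn

# Stage 1 (sketch): a VQ encoder turns an image into a small grid of
# discrete codebook indices. The real VQGAN uses a deep convolutional
# autoencoder plus an adversarial loss to keep reconstructions sharp;
# this single conv layer and codebook are stand-ins for illustration.
class ToyVQEncoder(nn.Module):
    def __init__(self, codebook_size=1024, dim=64):
        super().__init__()
        self.conv = nn.Conv2d(3, dim, kernel_size=16, stride=16)  # 256px -> 16x16 grid
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, img):
        z = self.conv(img)                    # (B, dim, 16, 16)
        z = z.flatten(2).transpose(1, 2)      # (B, 256, dim): one vector per cell
        # Quantize: snap each grid cell to its nearest codebook entry.
        dists = torch.cdist(z, self.codebook.weight[None])
        return dists.argmin(dim=-1)           # (B, 256) discrete token indices

encoder = ToyVQEncoder()
img = torch.randn(1, 3, 256, 256)             # a dummy 256x256 RGB image
tokens = encoder(img)
print(tokens.shape)                           # torch.Size([1, 256])

# Stage 2 (not shown): a GPT-style transformer models these 256 indices
# autoregressively -- conditioned on tokens from a depth map, label map,
# or pose -- and a decoder maps sampled indices back to pixels. Working
# with a few hundred tokens per image, instead of tens of thousands of
# pixels, is what makes high resolutions tractable.
```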

Comparison

So how does it compare to OpenAI’s image completion technique? Well, remember, that technique was beyond amazing and set a really high bar. So let’s have a look together! They were both given the upper half of this image and had to fill in the lower half. Remember, as we just learned, transformers are not great at high-resolution image synthesis. So here, for OpenAI’s Image-GPT, we expect heavily pixelated images… and… oh yes, that’s right. So now, hold on to your papers, and let’s see how much more detailed the new technique is. Holy mother of papers! Do you see what I see here? Image-GPT came out just a few months ago, and there is already this kind of progress. So there we go, just imagine what we will be able to do with these supercharged transformers just two more papers down the line. Wow. And that’s where I almost fell off the chair when reading this paper. Hope you held on to yours. It truly feels like we are living in a science fiction world. What a time to be alive!

Thanks for watching and for your generous support, and I'll see you next time!
