Google’s New AI: These Are More Than Images!
Duration: 6:35

Two Minute Papers · 18.12.2022 · 100,377 views · 4,166 likes


Video description
❤️ Check out Cohere and sign up for free today: https://cohere.ai/papers
📝 The paper "Self-Distilled StyleGAN: Towards Generation from Internet Photos" is available here: https://self-distilled-stylegan.github.io/
📝 Our paper with the material synthesis, i.e., "Gaussian Material Synthesis" is available here: https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/
I have decided to try Mastodon. If you are interested, you can follow me there too: https://sigmoid.social/@twominutepapers
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Edward Unthank, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Dominique Warner, Matthew Allen Fisher, Matthew Valle, Michael Albrecht, Michael Tedder, Nevin Spoljaric, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers
Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Károly Zsolnai-Fehér's links:
Mastodon: https://sigmoid.social/@twominutepapers
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (2 segments)

Segment 1 (00:00 - 05:00)

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we will see that modern AIs can not only generate images, but they can also make them come alive. Oh yes.

Now, as all of you know, we already have a bunch of text-to-image AIs around: OpenAI's DALL-E 2, a free variant, Stable Diffusion, and more. Here, we enter a text prompt, and it paints a beautiful image for us that satisfies our prompt. And whenever I create such an image, I am always wondering: what if we could also use this AI to actually edit these images, or, you know what, maybe even videos, to our liking? Today we will find out whether that is possible.

And wait, today, images are not the only thing we can generate with these techniques. As of October 2022, Google's Imagen Video AI is also out there, and with this, we have not just text to image, but text to video. And the results are truly insane.

And this work from scientists at Google is a little different. It builds on StyleGAN instead; this is not a text-to-image AI, but a generative model. The visual quality it can create is simply astounding. Just look at that. And get this: this new paper just added a bunch of amazing text-to-video-like features to it. Now, it can take a generative model like this and create a video for us, but we can choose what should happen in this video.

How? Well, first, we give it a bunch of images, for instance about horses, and it now generates new, high-quality images within this domain. But here is the key: these images are generated by using a latent space. You see an example of a latent space here from our previous paper, which helps an artist generate similar virtual material models by exploring with this dot on a 2D plane.
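The latent-space idea described above can be sketched in a few lines. This is a hedged toy example, not the paper's actual model: `toy_generator` is a hypothetical stand-in for a trained generator such as StyleGAN's, and the point is only that walking smoothly between two latent codes yields a sequence of gradually changing outputs, which is what makes the video-like results possible.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_generator(z):
    # Hypothetical stand-in for a trained generator (e.g., StyleGAN's):
    # any smooth function mapping a latent vector to an "image".
    return np.tanh(z)

# Two points in a 512-dimensional latent space.
z_start = rng.standard_normal(512)
z_end = rng.standard_normal(512)

# Walking linearly between the two latent codes produces a sequence of
# intermediate "frames"; because the generator is smooth, nearby codes
# give similar outputs, so consecutive frames differ only slightly.
frames = [toy_generator((1 - t) * z_start + t * z_end)
          for t in np.linspace(0.0, 1.0, num=8)]
```

With a real generator, the same walk between a "closed mouth" code and an "open mouth" code is what produces the roaring-lion sequence.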
Now, most of these techniques guarantee that when exploring nearby, we will get similar material models; however, it seems almost impossible to say, for instance, that we wish to see more metallic materials. We get similar materials, but we can't describe exactly what we are looking for.

However, hold on to your papers, because this new technique claims that it can do exactly that. Well, I will believe it when I see it. Let's see. Whoa! We can make this lion roar! And many others too. And what I really love about these results is that almost all of the intermediate images that were generated fit together so well that it seems like a believable, continuous video. But that's not all; it can do a ton more. For instance, it can rotate the heads of these parrots, and temporal consistency is remarkably good here too, which means that we don't have a ton of flickering. Now make no mistake, there is some, but these are getting closer and closer to being believable as actual video footage. What a time to be alive!

And if we have just the perfect parrot, but it takes up just a bit too much of the frame, not a problem. Look: zooming in and out of images is also possible, and it fills in the remaining parts of the image with sensible information. More of the parrot or more background. How cool is that!

And if we have created the perfect horse image with this AI, but we are yearning for a little more action, not a problem. Look. It can even make them run. Some of the intermediate images are not perfect, so this might not pass as a video, yet. But, as images, incredible.

The paper is available in the video description, and it is chock full of these incredible results. And, you know what, let's pop the hood and look inside together! For instance, they propose a new filtering step that supposedly has magical powers. Does it? Let's see.
A somewhat unorganized dataset comes in; for instance, these can be made by another AI. We bunch them up, throw them at the new neural network, and after filtering them down to a smaller subset, the new technique can generate significantly better images than other methods could before. How much better? Let's see. Whoa. Look at that! Other AIs prefer the results generated with the new filtering step. That's good news, but come on, we are not computers, we are humans. Does
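The filtering step described above can be sketched as "score every sample, keep only the most typical ones." This is a hedged toy version: the paper's actual criterion is more sophisticated, and here the "score" is just distance to the dataset mean, used as a crude stand-in for an outlier measure.

```python
import numpy as np

rng = np.random.default_rng(1)

# A noisy, unorganized pool of 1000 generated samples (16-dim stand-ins
# for images).
generated = rng.standard_normal((1000, 16))

# Score each sample by how typical it is; lower = closer to the bulk of
# the data. (Toy stand-in for the paper's perceptual filtering criterion.)
center = generated.mean(axis=0)
scores = np.linalg.norm(generated - center, axis=1)

# Keep only the best-scoring quarter; this tighter subset is what the
# model would be (re)trained on.
keep = 250
kept = generated[np.argsort(scores)[:keep]]
```

The intuition: training on the filtered, less outlier-ridden subset is what lets the model generate cleaner images than training on the raw pool.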

Segment 2 (05:00 - 06:00)

this really say anything about what humans prefer? You bet your papers it does! Look! Humans are also loving the results with the new technique. And the difference is remarkable. So good. So, welcome to Two Minute Papers, where we look at tables and flip out together. Hope you're enjoying it as much as I do. So, what do you think? Does this get your mind going? Let me know in the comments below!

Thanks for watching and for your generous support, and I'll see you next time!
