Google’s New AI: These Are More Than Images!
Duration: 6:35

Two Minute Papers · 18.12.2022 · 100,377 views · 4,166 likes


Video description
❤️ Check out Cohere and sign up for free today: https://cohere.ai/papers
📝 The paper "Self-Distilled StyleGAN: Towards Generation from Internet Photos" is available here: https://self-distilled-stylegan.github.io/
📝 Our paper with the material synthesis, i.e., "Gaussian Material Synthesis" is available here: https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/
I have decided to try Mastodon. If you are interested, you can follow me there too: https://sigmoid.social/@twominutepapers
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Edward Unthank, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Dominique Warner, Matthew Allen Fisher, Matthew Valle, Michael Albrecht, Michael Tedder, Nevin Spoljaric, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers
Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Károly Zsolnai-Fehér's links:
Mastodon: https://sigmoid.social/@twominutepapers
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (2 segments)

Segment 1 (00:00 - 05:00)

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we will see that modern AIs can not only generate images, but they can also make them come alive. Oh yes.

Now, as all of you know, we already have a bunch of text-to-image AIs around: OpenAI's DALL-E 2, a free variant, Stable Diffusion, and more. Here, we enter a text prompt, and it paints a beautiful image for us that satisfies our prompt. And whenever I create such an image, I am always wondering: what if we could also use this AI to actually edit these images, or, you know what, maybe even videos, to our liking? Today we will find out whether that is possible.

And wait, today, images are not the only thing we can generate with these techniques. As of October 2022, Google's Imagen Video AI is also out there, and with this, we have not just text to image, but text to video. And the results are truly insane.

And this work from scientists at Google is a little different. It builds on StyleGAN instead; this is not a text-to-image AI, but a generative model. The visual quality it can create is simply astounding. Just look at that. And get this: this new paper just added a bunch of amazing text-to-video-like features to it. Now, it can take a generative model like this and create a video for us, but we can choose what should happen in this video.

How? Well, first, we give it a bunch of images, for instance about horses, and it now generates new, high-quality images within this domain. But here is the key: these images are generated by using a latent space. You see an example of a latent space here from our previous paper, which helps an artist generate similar virtual material models by exploring with this dot on a 2D plane.
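The latent-space idea described above can be sketched in a few lines. This is a hedged toy example, not the paper's actual model: `toy_generator` is a hypothetical stand-in for a trained generator such as StyleGAN's, and the point is only that walking smoothly between two latent codes yields a sequence of gradually changing outputs, which is what makes the video-like results possible.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_generator(z):
    # Hypothetical stand-in for a trained generator (e.g., StyleGAN's):
    # any smooth function mapping a latent vector to an "image".
    return np.tanh(z)

# Two points in a 512-dimensional latent space.
z_start = rng.standard_normal(512)
z_end = rng.standard_normal(512)

# Walking linearly between the two latent codes produces a sequence of
# intermediate "frames"; because the generator is smooth, nearby codes
# give similar outputs, so consecutive frames differ only slightly.
frames = [toy_generator((1 - t) * z_start + t * z_end)
          for t in np.linspace(0.0, 1.0, num=8)]
```

With a real generator, the same walk between a "closed mouth" code and an "open mouth" code is what produces the roaring-lion sequence.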
Now, most of these techniques guarantee that when exploring nearby, we will get similar material models; however, it seems almost impossible to say, for instance, that we wish to see more metallic materials. We get similar materials, but we can't describe exactly what we are looking for.

However, hold on to your papers, because this new technique claims that it can do exactly that. Well, I will believe it when I see it. Let's see. Whoa! We can make this lion roar! And many others too. And what I really love about these results is that almost all of the intermediate images that were generated fit together so well that it seems like a believable, continuous video. But that's not all; it can do a ton more. For instance, it can rotate the heads of these parrots, and temporal consistency is remarkably good here too, which means that we don't have a ton of flickering. Now make no mistake, there is some, but these are getting closer and closer to being believable as actual video footage. What a time to be alive!

And if we have just the perfect parrot, but it takes up just a bit too much of the frame, not a problem. Look: zooming in and out of images is also possible, and it fills in the remaining parts of the image with sensible information. More of the parrot or more background. How cool is that!

And if we have created the perfect horse image with this AI, but we are yearning for a little more action, not a problem. Look. It can even make them run. Some of the intermediate images are not perfect, so this might not pass as a video, yet. But, as images, incredible.

The paper is available in the video description, and it is chock full of these incredible results. And, you know what, let's pop the hood and look inside together! For instance, they propose a new filtering step that supposedly has magical powers. Does it? Let's see.
A somewhat unorganized dataset comes in; for instance, these can be made by another AI. We bunch them up, throw them at the new neural network, and after filtering them down to a smaller subset, the new technique can generate significantly better images than other methods could before. How much better? Let's see. Whoa. Look at that! Other AIs prefer the results generated with the new filtering step. That's good news, but come on, we are not computers, we are humans. Does
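The filtering step described above can be sketched as "score every sample, keep only the most typical ones." This is a hedged toy version: the paper's actual criterion is more sophisticated, and here the "score" is just distance to the dataset mean, used as a crude stand-in for an outlier measure.

```python
import numpy as np

rng = np.random.default_rng(1)

# A noisy, unorganized pool of 1000 generated samples (16-dim stand-ins
# for images).
generated = rng.standard_normal((1000, 16))

# Score each sample by how typical it is; lower = closer to the bulk of
# the data. (Toy stand-in for the paper's perceptual filtering criterion.)
center = generated.mean(axis=0)
scores = np.linalg.norm(generated - center, axis=1)

# Keep only the best-scoring quarter; this tighter subset is what the
# model would be (re)trained on.
keep = 250
kept = generated[np.argsort(scores)[:keep]]
```

The intuition: training on the filtered, less outlier-ridden subset is what lets the model generate cleaner images than training on the raw pool.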

Segment 2 (05:00 - 06:00)

this really say anything about what humans prefer? You bet your papers it does! Look! Humans are also loving the results with the new technique. And the difference is remarkable. So good. So, welcome to Two Minute Papers, where we look at tables and flip out together. Hope you're enjoying it as much as I do. So, what do you think? Does this get your mind going? Let me know in the comments below!

Thanks for watching and for your generous support, and I'll see you next time!
