# Google's New AI: Dog Goes In, Statue Comes Out! 🗽

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=NnoTWZ9qgYg
- **Date:** 25.09.2022
- **Duration:** 8:20
- **Views:** 132,254

## Description

❤️ Check out Fully Connected by Weights & Biases: https://wandb.me/papers 

📝 The paper "DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation" is available here:
https://dreambooth.github.io/

Try it out:
1. https://huggingface.co/sd-dreambooth-library
2. https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_dreambooth_inference.ipyn
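The key idea behind the models linked above is that DreamBooth fine-tuning binds your subject to a rare identifier token; new scenes are then requested with ordinary text prompts built around that token. A minimal sketch of that prompt construction (the identifier "sks", the class word, and the templates are illustrative assumptions, not something the video specifies):

```python
# DreamBooth-style prompting: the fine-tuned model associates the subject
# with a rare identifier token, and new contexts are requested by templating
# prompts around "<identifier> <class noun>". Token and contexts are examples.

def dreambooth_prompts(identifier: str, subject_class: str, contexts: list[str]) -> list[str]:
    """Build subject-driven prompts of the form 'a photo of <id> <class> <context>'."""
    return [f"a photo of {identifier} {subject_class} {ctx}" for ctx in contexts]

# The recontextualizations shown in the video, as prompts:
for prompt in dreambooth_prompts("sks", "dog", ["swimming", "sleeping", "in a bucket"]):
    print(prompt)
```

Each resulting string would be passed as the text prompt to the fine-tuned diffusion model (for example via the Colab notebook above) to generate the corresponding image.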

AI Image interpolation:
https://twitter.com/xsteenbrugge/status/1558508866463219712

Felícia Zsolnai-Fehér’s works:
https://twitter.com/twominutepapers/status/1534817417238614017

Judit Somogyvári’s works:
https://www.artstation.com/sheyenne
https://www.instagram.com/somogyvari.art/

❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: 
- https://www.patreon.com/TwoMinutePapers
- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Ivo Galic, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Luke Dominique Warner, Matthew Allen Fisher, Michael Albrecht, Michael Tedder, Nevin Spoljaric, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=NnoTWZ9qgYg) Intro

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today you are going to see how Google's new AI just supercharged art generation. Again! Yes, this year, we are entering the age of AI-based art generation. OpenAI's DALL-E 2 technique is able to take a piece of text from us, and generate a stunning image that matches this description. Stable Diffusion, a similar, but open source technique, is also now available for everyone to use. And the results are so good that artists are already using it around the world to create illustrations for a novel, texture synthesis for virtual worlds, product design, and more.

So, are we done here? Is there nothing else to improve other than the visual fidelity of the results? Well, not quite!

Have a look at two of my favorite AI-generated images, this scholar who is desperately trying to hold on to his papers. So, I am very happy with this image, but imagine that if we were creating a comic, we would need more images of this chap doing other things. Can we do that? Well, we are experienced Fellow Scholars here, so we know that variant generation comes to the rescue, right? Well, have a look.

### [1:27](https://www.youtube.com/watch?v=NnoTWZ9qgYg&t=87s) Variant Generation

Clearly, the AI has an understanding of the image, and can create somewhat similar images: let's see - we get someone with a similar beard, similar paper which is similarly on fire. The fact that the AI can have a look at such an image and create variants is a miracle of science… but not quite what we are looking for. Why is that? Well, of course, this is a new scholar. We are looking for the previous scholar doing other things.

Now let's try our fox scientist too and see if this was maybe just an anomaly? Maybe DALL-E 2 is just not into our scholarly content? Let's see! Well, once again, the results are pretty good. It understood that the huge ears, gloves, lab coat and the tie are important elements of the image, but ultimately, this is a different fox scientist in a different style.

So, no more adventures for these scientists, right? Have we lost them forever? Should we give up? Well, not so fast! Have a look at Google's new technique, which promises a solution to this

### [2:42](https://www.youtube.com/watch?v=NnoTWZ9qgYg&t=162s) Synthesis

challenging problem! Yes, they promise that if we are able to take about 4 images of our subject, for instance, this good boy here, it will be able to synthesize completely new images with them. Let's see, this is the same dog in the Acropolis.

That is excellent. However… wait a minute! This photo is very similar to this one. So basically, there was a little synthesis going on here, but otherwise, changing the background carries the show here. That is not new. We can do that with already existing tools anyway.

So, is that it? Well, hold on to your papers, because the answer is no, it goes beyond changing backgrounds. It goes so far beyond that I don't even know where to start! For instance, here is our little doggy swimming, sleeping, in a bucket, and we can even give this good boy a haircut. That is absolutely insane. And all of these new situations were synthesized by using only 4 input photos. Wow. Now we're talking!

Similarly, if we have a pair of stylish sunglasses, we can ask a bear to wear them, make a cool product photo out of them, or put them in front of the Eiffel Tower. And as I am a light transport researcher by trade, I have to note that even secondary effects, like the reflection of the glasses here, are modeled really well. And so are the reflections here. Loving it.

But it can do so much more than this. Here are 5 of my favorite examples from the paper.

One, we can not only put our favorite teapot into different contexts, or see it in use, but we can even reimagine an otherwise opaque object and see what it would look like if it were made of a transparent material, like glass. I love it, and I bet that product design people will love it too.

Two, we can create art renditions of our test subject. Here, the input is only three photos of a dog, but the output, the output is priceless. We can commission art renditions from legendary artists of the past, and all this nearly for free.
How cool is that?

Three, I hope you liked this teapot property modification concept, because we are going to push it quite a bit further! For instance, before repainting our car, we can have a closer look at what it would look like, and we can also reimagine our favorite little pets as other animals. Which one is your favorite? Let me know in the comments below. Mine is the hippo. It has to be the hippo. Look at how adorable it is!

And if even that is not enough, four, we can also ask the AI to reimagine our little dog as a chef, a nurse, a police dog, and many others. And all of these images are fantastic!

And finally, five, it can perform no less than view synthesis too! We know from previous papers

### [6:03](https://www.youtube.com/watch?v=NnoTWZ9qgYg&t=363s) View Synthesis

that this is quite challenging, and this one only requires four photos of our cat, and of course, if the cat is refusing to turn in the right direction, which happens basically every time, well, no matter, the AI can resynthesize an image of it looking into our desired directions. This one looks like something straight out of a science fiction movie. What a time to be alive!

And this is a huge step forward, you see, previous techniques typically struggled with this; even in the best cases, the fidelity of the results suffers. This alarm clock is not the same as our input photo was. And neither is this one. But, with the new technique, now we're talking! Bravo Google!

And once again, don't forget, the First Law of Papers is on full display here: this huge leap happened just one more paper down the line. I am truly stunned by these results and I cannot even imagine what we will be able to do 2 more papers down the line!

So, what do you think? What would you use this for? Let me know in the comments below!

Thanks for watching and for your generous support, and I'll see you next time!

---
*Source: https://ekstraktznaniy.ru/video/13437*