# Google’s AI: This Should Be Impossible!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=bD_HyxHMHPo
- **Date:** 19.10.2023
- **Duration:** 6:05
- **Views:** 213,956

## Description

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers

📝 The paper "RealFill - Reference-Driven Generation for Authentic Image Completion" is available here:
https://realfill.github.io/
Unofficial implementation: https://github.com/thuanz123/realfill

My latest paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD 

Or this is the orig. Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Gaston Ingaramo, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Kenneth Davis, Klaus Busse, Kyle Davis, Lukas Biewald, Martin, Matthew Valle, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Károly Zsolnai-Fehér's research works: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=bD_HyxHMHPo) Segment 1 (00:00 - 05:00)

This idea sounds like science fiction, but by the end of the video, you will see that this makes perfect sense. So, what is it? Well, consider image inpainting. It is amazing. What can it do? Well, when we cut out a part of an image, then, bam! It fills in the void with plausible information. This is image inpainting. And now it works on video too.

But it gets even crazier. Image outpainting also works. Whoa! What is that? Well, we can essentially extend the image in any direction and once again, fill it in with plausible data.

Now please note the choice of words here: in both cases I said it fills it in with plausible data. Data that could be there. But synthetic data nonetheless. Now here is an insane idea from Google’s researchers. What if we would take these, and fill them in not with information that could have been there, but with information that was actually there. Filling in with reality. That is of course, impossible, right? Well, look. Oh yes. It seems like it is impossible.

If we try to complete this image with previous techniques, for instance, Stable Diffusion, we get something that is plausible, you know, the hat continues, the Post-its also continue, that is good, but still it is likely not the real thing. So, can I get the real thing?

Well, let’s think together. What if we are trying to outpaint a historical building? Wait a minute, that is the key! If we try to fill in information for something that we have other photos for, it might be possible. Let’s give it a try. This is the incomplete input photo, and here are our other photos. Now note that this is still quite hard. We can’t just copy it. The angles are different, the lighting is different, lens distortion is really different. But, in the age of AI, let’s see if it can be done. And…oh wow! Look at that! Perfection.

And it can do it for a variety of scenes over and over again. It appears to work pretty much everywhere. Well, it does not work everywhere, I’ll tell you about it in a moment. But, all this is absolutely amazing.

But still, wait. How do we know how real these photos are if there is nothing to compare to? Well, let’s make sure that there is something to compare to. Let’s take a real photo, cut off the top, and now we know exactly what should be there. Stable Diffusion does not know. Paint by Example, a paper from almost exactly a year ago does not know at all. But the new technique called RealFill, this one knows. Look. That is incredible. Almost pixel perfect reconstruction. My goodness. What a time to be alive!

Now note that this is not a copying machine, it has access to information about the room, but it has to understand which part is missing, and what that part would look like from this angle. So it fills in reality after all. And it does it over and over again with breathtaking accuracy.

Now, I noted that it is still not perfect. I mean, all of these look nearly perfect. So where are the issues? Ah. Of course. Text. It’s always the text. Every time. We finally left behind the age of AI systems generating mangled, incorrect hands, mostly, but text is still a challenge. I am fairly sure that this is something that will be possible just one more paper down the line. And can you imagine what will be possible two more papers down the line? My goodness. We can already do a pretty good reconstruction from just one image. Not even a set of images. One image. This is supposed to be a failure case. If this is a failure case, bravo, sign me up right now!

So, adding a little more information to the AI by reusing already existing images. That was the crazy idea, which in hindsight, makes perfect sense. What a brilliant paper. Loving it! And one more thing. I have a little daughter and when she was a baby,

### [5:00](https://www.youtube.com/watch?v=bD_HyxHMHPo&t=300s) Segment 2 (05:00 - 06:00)

we could not really afford a good smartphone to take better images of her. However, there are a lot of pictures, and I was thinking that over my lifetime, there will surely be an AI that will be able to upscale those not great images to a higher resolution version. And it should not just fill in things that could be, but with things that are really there. And finally, we are here. I can’t believe it. And all it needs is one to three photos. And as a family, we have thousands of photos of ourselves to learn from. So good.

This was Two Minute Papers with Dr. Károly Zsolnai-Fehér. Subscribe if you wish to see more.
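
The ground-truth check described in the video (take a real photo, cut off the top, then compare the completion with what was really there) can be sketched in a few lines. This is only an illustration and is not taken from the RealFill paper: the function names `cut_off_top` and `psnr_of_removed_rows` are hypothetical, the image is a toy grayscale grid, and PSNR is an assumed metric chosen for simplicity, since the video does not say how the results were scored.

```python
import math


def cut_off_top(image, fraction):
    """Blank the top `fraction` of rows; return (incomplete image, removed row count)."""
    removed = round(len(image) * fraction)
    if removed == 0:
        raise ValueError("fraction is too small: no rows would be removed")
    incomplete = [
        [0] * len(row) if y < removed else list(row)
        for y, row in enumerate(image)
    ]
    return incomplete, removed


def psnr_of_removed_rows(reference, completed, removed, max_value=255.0):
    """PSNR in dB, measured only over the rows that had been cut off."""
    squared_errors = [
        (r - c) ** 2
        for ref_row, out_row in zip(reference[:removed], completed[:removed])
        for r, c in zip(ref_row, out_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    return math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)


# Toy 8x8 grayscale "photo" with constant brightness 100 (made-up data).
photo = [[100] * 8 for _ in range(8)]
incomplete, removed = cut_off_top(photo, 0.25)

# A completion that restores the true pixels scores infinitely high;
# leaving the removed rows blank scores much lower.
exact = psnr_of_removed_rows(photo, photo, removed)
blank = psnr_of_removed_rows(photo, incomplete, removed)
print(exact, blank)
```

Scoring only the removed rows keeps the untouched part of the photo from inflating the result, which matters when the question is whether the filled-in region matches reality rather than merely looking plausible.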

---
*Source: https://ekstraktznaniy.ru/video/12974*