# Intel's Video Game Looks Like Reality! 🌴

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=22Sojtv4gbg
- **Date:** 05.06.2021
- **Duration:** 8:03
- **Views:** 887,342
- **Source:** https://ekstraktznaniy.ru/video/13897

## Description

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers

📝 The paper "Enhancing Photorealism Enhancement" is available here:
https://isl-org.github.io/PhotorealismEnhancement/

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: https://discordapp.com/in

## Transcript

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. This paper is called Enhancing Photorealism Enhancement. Hmm! Let’s try to unpack what exactly that means. This means that we take video footage from a game, for instance, GTA 5, which is an action game where the city we can play in was modeled after real places in California.

Now, as we are living through the advent of neural network-based learning algorithms, we have a ton of training data at our disposal on the internet. For instance, the Cityscapes dataset contains images and videos taken in 50 real cities, and it also contains annotations that describe which object is which.

And the authors of this paper looked at this, and had an absolutely insane idea. And the idea is: let’s learn from the Cityscapes dataset what cars, cities and architecture look like, then take a piece of video footage from the game, and translate it into a real movie. So basically something that is impossible. That is an insane idea, and when I read this paper, I thought that cannot possibly work in any case, but especially not given that the game takes place in California, and the Cityscapes dataset contains mostly footage of German cities. How would a learning algorithm pull that off? There is no way this will work.

Now, there are previous techniques that attempted this, here you see a few of them. And…well, the realism is just not there, and there was an even bigger issue.

### CUT [1:45]

And that is the lack of temporal coherence. This is the flickering that you see when the AI processes each frame independently, so its output is not consistent from one frame to the next. This quickly breaks the immersion and is typically a deal-breaker.
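To make the idea of flicker concrete, here is a minimal sketch of a crude flicker score: the average per-pixel change between consecutive frames. This is not a metric from the paper. The names `mean_abs_diff` and `flicker_score` are hypothetical, frames are assumed to be small grayscale grids, and the score is only meaningful for a static shot, because camera or object motion also changes pixels.

```python
from typing import Sequence

# Assumption: a frame is a grayscale image stored as rows of pixel intensities.
Frame = Sequence[Sequence[float]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def flicker_score(frames: Sequence[Frame]) -> float:
    """Average frame-to-frame change over a clip (higher means more flicker)."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    diffs = [mean_abs_diff(prev, cur) for prev, cur in zip(frames, frames[1:])]
    return sum(diffs) / len(diffs)


# A static scene rendered consistently versus one whose brightness jumps
# back and forth between frames.
dark = [[100, 100], [100, 100]]
bright = [[120, 120], [120, 120]]
steady_clip = [dark, dark, dark]
flickery_clip = [dark, bright, dark]
```

On these toy clips, `flicker_score(steady_clip)` is 0 and `flicker_score(flickery_clip)` is 20, which mirrors what we see on screen: a method that treats every frame independently can produce different brightness or texture for the same static content.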

### TSIT [2:00]

And now, hold on to your papers…and let’s have a look together at the new technique. Whoa! This is nothing like the previous ones! It renders the exact same place, the exact same cars, and the badges are still correct and still refer to real-world brands. And that’s not even the best part, look! The car paint materials are significantly more realistic, something that is really difficult to capture in a real-time rendering engine. Lots of realistic-looking specular highlights off of something that feels like the real geometry of the car. Wow.

Now, as you see, most of the generated photorealistic images are dimmer and less saturated than the video game graphics. Why is that? This is because computer game engines often create a more stylized world where the saturation, haze, and bloom effects are often more pronounced. Let’s try to fight this bias where many people consider the more saturated images to be better, and focus our attention on the realism in these image pairs.

While we are there, for reference, we can have a look at what the output would be if we didn’t do any of the photorealistic magic, but instead, we just tried to breathe more life into the video game footage by trying to transfer the color schemes from these real-world videos in the training set. So, only color transfer. Let’s see.
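As a rough illustration of what "only color transfer" can mean, here is a minimal sketch of one classic approach: shifting and scaling each color channel of the game frame so that its mean and standard deviation match those of a real-world reference image. Whether this is the exact baseline shown in the video is an assumption. The function names are hypothetical, and for simplicity the sketch works directly on RGB values in a flat list of pixels.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B), each assumed to be in 0..255


def channel_stats(pixels: Sequence[Pixel]) -> List[Tuple[float, float]]:
    """Per-channel (mean, standard deviation) of a flat list of RGB pixels."""
    return [(mean(ch), pstdev(ch)) for ch in zip(*pixels)]


def transfer_color(source: Sequence[Pixel], reference: Sequence[Pixel]) -> List[Pixel]:
    """Match each channel of `source` to the statistics of `reference`."""
    src_stats = channel_stats(source)
    ref_stats = channel_stats(reference)
    out: List[Pixel] = []
    for px in source:
        new_px = []
        for value, (s_mean, s_std), (r_mean, r_std) in zip(px, src_stats, ref_stats):
            # Avoid dividing by zero when a source channel is constant.
            scale = r_std / s_std if s_std > 0 else 1.0
            matched = (value - s_mean) * scale + r_mean
            new_px.append(min(255.0, max(0.0, matched)))  # clamp to valid range
        out.append((new_px[0], new_px[1], new_px[2]))
    return out


# Toy example: a saturated red "game" image and a dimmer gray "photo".
game = [(200, 50, 50), (240, 90, 90)]
photo = [(100, 100, 100), (120, 120, 120)]
recolored = transfer_color(game, photo)
```

Note that every output pixel stays at the position of its input pixel: only the color statistics change, while geometry, textures, and materials are left exactly as the game rendered them.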

### Color Transfer [3:38]

Yes, that helps…until we compare the results with the photorealistic images synthesized by this new AI. Look. The trees don’t look nearly as realistic as they do with the new method, and after we see the real roads, it’s hard to settle for the synthetic ones from the game.

However, no one said that Cityscapes is the only dataset we can use for this method. In fact, if we still find ourselves yearning for that saturated look, we can try to plug in a more stylized dataset, and get…this! This is fantastic, because these images don’t have many of the limitations of computer graphics rendering systems. Why is that? Because, look at the grass here. In the game, it looks like a 2D texture to save resources and be able to render an image quicker. However, the new system can put more real-looking grass in there, which is a fully 3D object where every single blade of grass is considered.

The most mind-blowing thing here is that this AI finally has enough generalization capabilities to learn about cities in Germany, and still be able to make convincing photorealistic images for California. The algorithm never saw California, and yet, it can recreate it from video game footage better than I ever imagined would be possible. That is mind-blowing. Unreal.

And if you have been holding on to your papers so far, now, squeeze that paper. Because here, we have one of those rare cases where we squeeze our papers for not a feature, but for a limitation…of sorts. You see, there are limits to this technique too. For instance, since the AI was trained on the beautiful lush hills of Germany and Austria, it hasn’t really seen the dry hills of LA. So, what does it do with them? Look, it redrew the hills the only way it saw hills exist, which is, with trees. Now, we can think of this as a limitation, but also as an opportunity. Just imagine the amazing artistic effects we could achieve by playing this trick to our advantage.

Also, we won’t need to create an 80% photorealistic game like this one and push it up to 100% with the AI. We could draw not 80%, but the bare minimum, maybe only 20% for the video game, a coarse draft, if you will, and let the AI do the heavy lifting! Imagine how much modeling time we could save for artists as well. I love this. What a time to be alive!

Now, all of this only makes sense for real-world use if it can run quickly. So can it? How long do we have to wait to get such a photorealistic video? Do we have to wait minutes, or even hours? No! The whole thing runs interactively, which means that it is already usable, and we can plug this into the game as a post-processing step. And remember the First Law of Papers, which says that two more papers down the line, and it will be even better.

What improvements do you expect to happen soon? And what would you use this for? Let me know in the comments below! Thanks for watching and for your generous support, and I'll see you next time!
