# AI Learns Noise Filtering For Photorealistic Videos | Two Minute Papers #215

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=YjjTPV2pXY0
- **Date:** 17.12.2017
- **Duration:** 4:16
- **Views:** 64,496
- **Source:** https://ekstraktznaniy.ru/video/14539

## Description

The paper "Interactive Reconstruction of Monte Carlo Image Sequences using a Recurrent Denoising Autoencoder" is available here:
http://research.nvidia.com/publication/interactive-reconstruction-monte-carlo-image-sequences-using-recurrent-denoising

The paper with the notoriously difficult "Spheres" scene:
https://users.cg.tuwien.ac.at/zsolnai/gfx/adaptive_metropolis/

We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Andrew Melnychuk, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dave Rushton-Smith, Dennis Abts, Eric Haddad, Esa Turkulainen, Evan Breznyik, Kaben Gabriel Nanlohy, Malek Cellier, Marten Rauschenberg, Michael Albrecht, Michael Jensen, Michael Orenstein, Raul Araújo da Silva, Robin Graham, Steef, Steve Messina, Sunil Kim, Torsten Reil.
https://www.patreon.com/TwoMinutePapers

One-time payments:
PayPal: https://www.paypal.me/TwoMinutePapers
Bitcoin: 13hhmJnLEzwXgmgJN7RB6bWVdT7WkrFAHh
Ethereum: 0x002BB163DfE89B7aD0712846F1a1E5

## Transcript

### <Untitled Chapter 1>

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This is another one of those amazing papers that I am really excited about, because it sits at the intersection of computer graphics and machine learning, which, as you know, is already enough to make me happy. And when I first saw the quality of the results, I was delighted to see that it delivered exactly what I was hoping for.

Light simulation is an important subfield of computer graphics in which we try to create a photorealistic image of a 3D digital scene by simulating the paths of millions and millions of light rays. We start out with a noisy image, and as we compute more paths, it slowly clears up. However, it takes a very long time to get a perfectly clear image; depending on the scene and the algorithm, it can take anywhere from minutes to hours. In an earlier work, we had a beautiful but pathological scene that took weeks to render on several machines; if you would like to hear more about that, the link is available in the video description.

To alleviate this problem, many noise filtering algorithms have surfaced over the years. Instead of computing more and more paths until the image clears up, these algorithms stop at a noisy image and try to guess what the final image would look like. This often happens in the presence of additional depth and geometry information, stored in additional images that are often referred to as feature buffers or auxiliary buffers. This information helps the noise filter build a better understanding of the scene and produce higher-quality outputs.

Recently, a few learning-based algorithms emerged with excellent results. Well, "excellent" would be an understatement, since these can take an extremely noisy image that we rendered with just one ray per pixel. This is as noisy as it gets, I'm afraid, and it is absolutely stunning that we can still get usable images out of this.
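The "slowly clears up" behaviour described above can be illustrated with a toy Monte Carlo estimator. The integrand below is a made-up stand-in for a pixel's light transport integral, not anything from a real renderer; it just shows how averaging more random "paths" drives the estimate toward the true value.

```python
import math
import random

def render_pixel(num_paths, seed=0):
    """Toy Monte Carlo estimate of one pixel's brightness.

    The "light transport integral" here is just a stand-in:
    the integral of sin(pi * x) over [0, 1], whose exact
    value is 2/pi. Each random sample plays the role of one
    traced light path.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_paths):
        x = rng.random()                  # one random "light path"
        total += math.sin(math.pi * x)    # that path's contribution
    return total / num_paths              # average over all paths

true_value = 2 / math.pi
for n in (1, 16, 256, 4096):
    estimate = render_pixel(n, seed=42)
    print(f"{n:5d} paths -> error {abs(estimate - true_value):.4f}")
```

The error of such an estimator shrinks in expectation as 1/sqrt(N), which is why the last stretch from "slightly noisy" to "perfectly clear" is so expensive: halving the noise costs four times the paths.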
However, these algorithms are not capable of dealing with sequences of data and are condemned to deal with each of these images in isolation. They have no understanding of the fact that we are dealing with an animation.

### Reconstruction Classroom [2:01]

What does this mean exactly? What this means is that the network has no memory of how it dealt with the previous image, and if we combine it with the fact that a trace amount of noise still remains in the

### Reconstruction Crytek Sponza Glossy materials [2:15]

images, we get a disturbing flickering effect. This is because the remainder of the noise is different from image to image. This technique uses a Recurrent Neural Network, which is able to deal with sequences of data, for instance, in our case, video.

### Reconstruction Crytek Sponza Specular materials [2:28]

It remembers how it dealt with the previous images a few moments ago, and, as a result, it can adjust and produce outputs that are temporally stable. Computer graphics researchers like to call this spatiotemporal filtering. You can see in this camera panning experiment how much smoother this new technique is. Let's try the same footage slowed down and see if we get a better view of the flickering. Yup, all good!

Recurrent Neural Networks are by no means easy to train, and getting them right involves quite a few implementation details, so make sure to have a look at the paper. Temporally coherent light simulation reconstruction of noisy images from one sample per pixel, and for video. This is insanity. I would go out on a limb and say that in the very near future, we'll run learning-based noise filters on images that are so noisy, they don't even have one ray sample per pixel; maybe one for every other pixel or so. This is going to be the new milestone. If someone had told me this would be possible when I started doing light transport as an undergrad student, I wouldn't have believed a word of it.

Computer games, VR, and all kinds of real-time applications will be able to get photorealistic light simulation graphics in real time, and temporally stable to boot. I need to take some time to digest this. Thanks for watching and for your generous support, and I'll see you next time!
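The benefit of carrying state across frames can be sketched with something far simpler than the paper's recurrent denoising autoencoder: an exponential moving average over frames. This is only an illustration of why statefulness suppresses flicker; the running `state` plays the role an RNN's hidden state plays in the actual technique, and all the scene values below are made up.

```python
import random

def temporal_filter(frames, alpha=0.2):
    """Blend each new noisy frame into a running state.

    Not the paper's recurrent autoencoder -- just the simplest
    stateful filter. The carried `state` damps the frame-to-frame
    noise changes that a per-frame (stateless) filter cannot see.
    """
    state = None
    filtered = []
    for frame in frames:
        if state is None:
            state = list(frame)
        else:
            state = [(1 - alpha) * s + alpha * f
                     for s, f in zip(state, frame)]
        filtered.append(list(state))
    return filtered

def flicker(frames):
    """Mean absolute frame-to-frame change of the first pixel."""
    return sum(abs(a[0] - b[0])
               for a, b in zip(frames, frames[1:])) / (len(frames) - 1)

# A static scene whose true pixel values never change, rendered
# with fresh noise every frame -- the source of the flicker the
# video describes.
rng = random.Random(0)
truth = [0.5, 0.8, 0.2]
frames = [[t + rng.gauss(0, 0.1) for t in truth] for _ in range(60)]

filtered = temporal_filter(frames)
print("flicker before:", flicker(frames))
print("flicker after: ", flicker(filtered))
```

A plain moving average like this smears detail under camera motion, which is exactly why the paper's learned recurrent filter is interesting: it decides per pixel how much of the past to keep.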
