This AI Learned To Create Dynamic Photos! 🌁

Two Minute Papers · Jan 26, 2021 · 7:02 · 152,915 views · 9,238 likes

Video description
❤️ Check out Weights & Biases and sign up for a free demo here: https://www.wandb.com/papers
❤️ Their report on this paper is available here: https://wandb.ai/wandb/xfields/reports/-Overview-X-Fields-Implicit-Neural-View-Light-and-Time-Image-Interpolation--Vmlldzo0MTY0MzM
📝 The paper "X-Fields: Implicit Neural View-, Light- and Time-Image Interpolation" is available here: http://xfields.mpi-inf.mpg.de/
📝 Our paper on neural rendering (and more!) is available here: https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/
📝 Our earlier paper with high-resolution images for the caustics is available here: https://users.cg.tuwien.ac.at/zsolnai/gfx/adaptive_metropolis/
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Haro, Alex Serban, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Lau, Eric Martel, Gordon Child, Haris Husic, Jace O'Brien, Javier Bustamante, Joshua Goller, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh. If you wish to support the series, click here: https://www.patreon.com/TwoMinutePapers
Thumbnail background image credit: https://pixabay.com/images/id-820011/
Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Transcript

Introduction

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Approximately 5 months ago, we talked about a technique called Neural Radiance Fields, or NeRF for short, where the input is the location of the camera and an image of what the camera sees. We take a few of those, give them to a neural network to learn, and synthesize new, previously unseen views of not just the materials in the scene, but the entire scene itself. In short, we take a few samples, and the neural network learns what should be there between the samples. In comes non-continuous data, a bunch of photos, and out goes a continuous video where the AI fills in the data between these samples. With this, we can change the view direction, but only that!
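To make the "learn what should be there between the samples" idea concrete, here is a minimal sketch of a coordinate-based network fitted to a handful of views and then queried in between them. This illustrates only the interpolation idea, not the actual NeRF architecture, which maps a 3D position and view direction to color and density and renders via ray marching; all names, shapes, and training details below are made up for the example.

```python
import torch
import torch.nn as nn

# Toy implicit image field: a scalar view coordinate in [0, 1] maps to a
# whole RGB image. Illustrative only -- the real NeRF model works per ray,
# not per image, and its inputs are 3D positions and view directions.
H, W = 32, 32

class ViewField(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, H * W * 3), nn.Sigmoid(),  # pixels in [0, 1]
        )

    def forward(self, view_coord):                 # (batch, 1)
        return self.net(view_coord).view(-1, H, W, 3)

# A few "captured" views at known coordinates (random stand-ins here).
coords = torch.tensor([[0.00], [0.25], [0.50], [0.75], [1.00]])
images = torch.rand(5, H, W, 3)

model = ViewField()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):                           # deliberately overfit to this one scene
    opt.zero_grad()
    loss = ((model(coords) - images) ** 2).mean()
    loss.backward()
    opt.step()

# Query an unseen, in-between coordinate: the network fills in the gap.
novel_view = model(torch.tensor([[0.6]]))          # (1, H, W, 3)
```

The key design point is that the network is fitted per scene, so it does not need to generalize across scenes; it only has to interpolate plausibly between the few samples it has seen.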

Potential Variables

This concept can also be used for other variables. For instance, this work is able to change the lighting, but only the lighting. By the way, this one is from long ago, from around Two Minute Papers episode number 13, so our seasoned Fellow Scholars know that it was almost 500 episodes ago. The third potential variable is time. With this AI-based physics simulator, we can advance the time, and the algorithm tries to guess how a piece of fluid will evolve over time. This was amazing, but as you might have guessed, we can advance the time, but only the time. And these were just a couple of examples from a slew of works that are capable of doing one, or at most two, of these. These are all amazing techniques, but they offer separate features: one can change the view, but nothing else; one the illumination, but nothing else; and one the time, but nothing else. With the advent of neural network-based learning algorithms, I wonder whether it is possible to create an algorithm that does all three. Or is this just science fiction? Well, hold on to your papers, because with this new work, which goes by the name X-Fields, we can indeed change the time… and the view direction… and the lighting, each separately. Or, even better, all three at the same time.
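Conceptually, X-Fields widens the coordinate from one axis to three. As a hedged sketch of that shared interface (the published method does not predict pixels directly; it learns warping flows that re-combine the captured photos, which is what keeps the output sharp), with all names and shapes again illustrative:

```python
import torch
import torch.nn as nn

# X-Fields-style interface: one network, three coordinate axes.
# The real method decodes coordinates into warp fields that blend the
# input photos; here we output pixels directly just to show the API.
H, W = 32, 32

class XField(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),       # (view, light, time)
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, H * W * 3), nn.Sigmoid(),
        )

    def forward(self, coords):                     # (batch, 3)
        return self.net(coords).view(-1, H, W, 3)

model = XField()
# Change only the lighting while view and time stay fixed...
img_a = model(torch.tensor([[0.5, 0.1, 0.5]]))
img_b = model(torch.tensor([[0.5, 0.9, 0.5]]))
# ...or move along all three axes at once.
img_c = model(torch.tensor([[0.2, 0.7, 0.9]]))
```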

Results

Woo-hoo! Look at how we can play with the time back and forth and set the fluid levels as we desire, that is the time part, and we can also play with the other two parameters at the same time. But still, the results that we see here can range from absolutely amazing to trivial, depending on just one factor. And that factor is how much training data was available for the algorithm. Neural networks typically require loads of training data to learn a new concept. For instance, if we wish to teach a neural network what a cat is, we have to show it thousands and thousands of images of cats. So, how much training data is needed for this? And now, hold on to your papers, and… whoa… look at these 5 dots here. Do you know what this means? It means that all the AI saw was five images, that is, five samples of the scene with different light positions, and it could fill in all the missing details with such accuracy that we can create this smooth and creamy transition. It almost feels like we had made at least a hundred photographs of the scene. And all this from 5 input photos. Absolutely amazing. Now, here is my other favorite example. I am a light transport simulation researcher by trade, so by definition, I love caustics. A caustic is a beautiful phenomenon in nature where curved surfaces reflect or refract light and concentrate it into a relatively small area. I hope that you are not surprised when I say that it is the favorite phenomenon of most light transport researchers. And just look at how beautifully it deals with it. You could take any of these intermediate, AI-generated images and sell them as real ones, and I doubt anyone would notice. So, it does three things that previous techniques could do one by one, but really, how does
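The "five dots" setup can be phrased in a few lines. Continuing the hypothetical XField sketch from above: fit the field to five photos taken at five known light positions, then sweep the light coordinate densely while holding view and time fixed, and every intermediate frame comes essentially for free.

```python
import torch

# Reuses the XField `model` from the sketch above. Hold view and time
# fixed, sweep only the light coordinate: five photos in, roughly a
# hundred in-between frames out.
n_frames = 100
view  = torch.full((n_frames, 1), 0.5)                    # fixed view
light = torch.linspace(0.0, 1.0, n_frames).unsqueeze(1)   # dense light sweep
time_ = torch.full((n_frames, 1), 0.5)                    # fixed time
frames = model(torch.cat([view, light, time_], dim=1))    # (100, H, W, 3)
```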

Thin Geometry

its quality compare to these previous methods? Let's see how it does on thin geometry, which is a notoriously difficult case for these methods. Here is a previous one. Look. The thick part is reconstructed correctly; however, look at the missing top of the grass blade. Yup, that's gone. A different previous technique, by the name of Local Light Field Fusion, not only missed the top as well, but also introduced halo-like artifacts into the scene. And, as you see with this footage, the new method solves all of these problems really well and is quite close to the true reference footage that we kept hidden from the AI. Perhaps the best part is that it also has an online demo that you can try right now, so make sure to click the link in the video description to have a look. Of course, not even this technique is perfect; there are cases where it might confuse the foreground with the background, and we are still not out of the woods when it comes to thin geometry. Also, an extension that I would love to see is changing material properties. Here, you see some results from our earlier paper on neural rendering where we can change the material properties of this test object and get a near-perfect photorealistic image

Conclusion

of it in about 5 milliseconds per image. I would love to see it combined with a technique like this one, and while it looks super challenging, it is easily possible that we will have something like that within 2 years. The link to our neural rendering paper and its source code is also available in the video description. What a time to be alive! Thanks for watching and for your generous support, and I'll see you next time!
