# Microsoft’s New AI: Ray Tracing 16,000,000 Images!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=qYJk9l65eJ8
- **Date:** 06.06.2025
- **Duration:** 6:13
- **Views:** 115,135

## Description

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambda.ai/papers

Guide for using DeepSeek on Lambda:
https://docs.lambdalabs.com/education/large-language-models/deepseek-r1-ollama/?utm_source=two-minute-papers&utm_campaign=relevant-videos&utm_medium=video

📝 The paper "RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination" is available here:
https://microsoft.github.io/renderformer/

📝 Our neural rendering paper "Gaussian Material Synthesis" is available here:
https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/


🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi. If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

My research: https://cg.tuwien.ac.at/~zsolnai/
X/Twitter: https://twitter.com/twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu

## Transcript

### [0:00](https://www.youtube.com/watch?v=qYJk9l65eJ8) Segment 1 (00:00 - 05:00)

I am kind of stunned to say this is a research work I did not expect at all, and least of all from Microsoft, but they absolutely nailed neural rendering. I can't believe what I am seeing here, and I think if you watch this, you'll be just as surprised. I hope so.

What is that? Well, here, rendering an image means doing a proper light simulation, which creates these beautiful results but also takes extremely long. Why? Because you need to shoot millions and millions of rays of light into the scene, and initially this leads to a lot of noise, which cleans up over time. It may take from minutes to weeks to clean up (a toy sketch of this slow convergence appears at the end of this segment). If we could get an algorithm that can do this instantly, that would be an absolute game changer. So instead, we replace this whole renderer with a neural network that we teach to perform the rendering itself. Can we do that? I had the honor and the luck of writing one of the early papers on neural rendering.

Now check this out. This is the real simulation, and this is the guess of the neural network. They are pretty close. So that is good news. However, the crazy part is that the neural network took not weeks but just a few milliseconds to do this. It can do this 500 times per second, easily faster than real time. However, this technique was constrained to a particular scene and a particular view of the scene. So that's quite limited.

And then, 7 years later, scientists at Microsoft had a crazy idea. They say: we have these transformer neural networks, the same kind that powers ChatGPT, and these are great at processing tokens. So let's just break down the camera, the objects, and the whole scene into tiny little tokens, throw them into a transformer neural network, do this for about 16 million images, and then see what happens.

So, level one out of four. Goodness, the initial results are fantastic. Still images, but you see some color bleeding along the back wall of the Cornell box. So this is proper light transport right there. A variety of objects, glossy reflections. So far, so good.

Level two out of four. Oh my, you can even edit the scene. Look, the material properties are being edited here, namely the roughness, as we go from a mirror-like material to a perfectly diffuse one. Or you can also change the lighting of the scene and see how beautifully it recalculates the appearance of the new scene. Imagine adjusting real-world lighting with a slider. Reality can now be photoshopped.

Level three out of four. And now things get crazy. I don't know how much better things can become. And can you believe this? It even supports proper animated scenes. And now hold on to your papers, fellow scholars, because now you will hear how long it takes for this to give you a real-looking light transport simulation. Days per image? Nope. Seconds per image? Also nope. It takes 76 milliseconds for one image, which is about 13 frames per second. That is interactive, so you can even play with it. Not quite real time yet, but that is coming, easily just one more paper down the line. An absolutely incredible contribution. Wow. And yes, level four is coming in a moment.

Dear fellow scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. So here's the key: it also works on a variety of scenes, not just on ones that it had already seen before. That is the game changer part. Now, this all makes it sound easy, but there is more to it. For instance, there is not just one neural network but two: one for view-dependent effects and one for view-independent ones. So one can move the camera around, if you will.
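The video does not show the paper's code, so here is a minimal sketch of the two-stage idea it describes: scene triangles become tokens for a view-independent transformer that models light transport between them, and per-pixel ray tokens then query that result in a view-dependent stage. All names, token layouts, and sizes below are illustrative assumptions, not RenderFormer's actual architecture:

```python
# A minimal sketch, assuming PyTorch, of "scene as tokens -> transformer":
# triangles become tokens, a view-independent stage models transport between
# them, and ray tokens query the result in a view-dependent stage. Every
# dimension and layer count here is an illustrative guess, not the paper's.
import torch
import torch.nn as nn

D = 256  # token width (assumed)

class ToyNeuralRenderer(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 vertices x (position + normal) plus 4 material scalars -> one token
        self.tri_embed = nn.Linear(3 * 6 + 4, D)
        # View-independent stage: triangle-to-triangle light transport
        enc = nn.TransformerEncoderLayer(D, nhead=8, batch_first=True)
        self.transport = nn.TransformerEncoder(enc, num_layers=6)
        # View-dependent stage: per-pixel rays (origin + direction) attend
        # to the transported scene tokens
        self.ray_embed = nn.Linear(6, D)
        dec = nn.TransformerDecoderLayer(D, nhead=8, batch_first=True)
        self.view = nn.TransformerDecoder(dec, num_layers=4)
        self.to_rgb = nn.Linear(D, 3)

    def forward(self, triangles, rays):
        # triangles: (batch, n_tris, 22), rays: (batch, n_pixels, 6)
        scene = self.transport(self.tri_embed(triangles))
        pixels = self.view(self.ray_embed(rays), scene)
        return self.to_rgb(pixels)  # (batch, n_pixels, 3) pixel colors

model = ToyNeuralRenderer()
out = model(torch.randn(1, 1024, 22), torch.randn(1, 32 * 32, 6))
print(out.shape)  # torch.Size([1, 1024, 3])
```

Training something along these lines on roughly 16 million rendered images, as the video mentions, is what lets one forward pass stand in for the whole simulation, and it is also why new scenes work: the network has seen enough token combinations to generalize rather than memorize a single scene.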
But wait a second, I'm thinking: does this all mean that we can give it the Two Minute Papers special? You know what that is? Of course: physics simulations. Not a chance, right? Well, level four. Now, check this out. Holy mother of papers. Physics simulations rendered through a neural light transport algorithm, interactively. That is a huge achievement. Perhaps even a historic moment. I am absolutely stunned. And yes, very soon, likely just one more paper down the line, we are going to have incredible physics
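And here is the toy sketch promised earlier of why the classical simulation takes minutes to weeks: a path tracer estimates each pixel by averaging random light-path samples, and the noise shrinks only with the square root of the sample count, so halving the noise costs four times the rays. The `noisy_pixel_sample` stand-in is hypothetical; this illustrates the convergence rate, not an actual renderer:

```python
# Why Monte Carlo rendering is slow: the standard error of a pixel estimate
# falls as 1/sqrt(n), so every extra digit of accuracy costs 100x the rays.
# noisy_pixel_sample is a hypothetical stand-in for tracing one light path.
import math
import random

def noisy_pixel_sample():
    # Pretend each traced path returns a noisy brightness; true mean is 0.5.
    return random.random()

for n in [100, 10_000, 1_000_000]:
    estimate = sum(noisy_pixel_sample() for _ in range(n)) / n
    expected_err = 1 / (2 * math.sqrt(3 * n))  # std. error of the mean
    print(f"{n:>9} rays -> estimate {estimate:.4f} "
          f"(typical error ~ {expected_err:.4f})")
```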

### [5:00](https://www.youtube.com/watch?v=qYJk9l65eJ8&t=300s) Segment 2 (05:00 - 06:00)

simulations like this, where both the physics and the rendering will be done by an AI technique, and all this in real time. Both how things move and how things look will be computed by an AI. I've got to say, I cannot wait. And yet, once again, I don't see anyone talking about this. This might be the only place on the internet where you hear a scientist talking about this incredible work. This is proper AI research work, and not the daily clickbait nonsense that your vacuum cleaner is forming an underground rebellion in your kitchen, as some of the tech articles suggest.

Here you see me running the full DeepSeek AI model through Lambda GPU Cloud: 671 billion parameters running super fast and super reliably. This is insane. I love it, and I use it on a regular basis. Lambda provides you with powerful NVIDIA GPUs to run your own chatbots and experiments. Seriously, try it out now at lambda.ai/papers or click the link in the description.

---
*Source: https://ekstraktznaniy.ru/video/12337*