NVIDIA’s New AI: Next Level Games Are Coming!

Duration: 7:12

Two Minute Papers · 09.06.2025 · 70,341 views · 3,408 likes

Video description

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers

📝 The papers are available here:
https://research.nvidia.com/labs/toronto-ai/difix3d/
https://sites.google.com/view/cast4
https://syntec-research.github.io/UVGA/

📝 My paper on simulations that look almost like reality is available for free here: https://rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations: https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Sven Pfiffner, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

My research: https://cg.tuwien.ac.at/~zsolnai/
X/Twitter: https://twitter.com/twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu

#nvidia

Table of contents (3 segments)

Intro

I want to create amazing virtual worlds where we can talk and play together efficiently. In the age of AI, this shouldn’t be a problem at all. Except that it is still impossible. You see, when we try to render a virtual copy of the real world efficiently, it’s not great. Breathing life into it by populating it with objects? Not a chance.

New AI

And it gets even worse when we try putting real-looking humans in it. Oh my goodness. I don’t want to talk to this person. So this seems completely hopeless. So why is everyone talking about AI this, AI that if it can’t pull this off? Well, to get the answer to that, all you have to do is look at these papers. Yes, luckily, we have three amazing works that might solve all three of these problems.

Let’s start with this one. First, rendering worlds. In goes a bunch of images of the scene, but they do not cover everything. So we need a technique to learn the scene, and to be able to draw it from viewpoints we’ve never seen it from. That is really tough. It is kind of possible with NeRFs and Gaussian splatting, two of the go-to techniques for this these days. However, not so fast. If we don’t have enough information, they can still introduce lots of noise and visual artifacts. And some of the results are just criminally bad. So, I don’t think a single new paper could fix all of that of course, so let’s see…goodness. It can! These suddenly look almost perfect. Absolutely amazing. So how is this even possible? What the heck happened here? Well, a genius idea happened. This AI technique is trained not to give us the perfect answer immediately, but to take an imperfect one, and learn to clean it up. That is nearly as good as giving the perfect answer, however, it is much simpler to pull off. And when I look at the results, with previous techniques I’m thinking, I don’t want to use any of these. And in just one paper we go from that to…wow, let’s start using this right now!

So, worlds are working okay now. Great, but remember, that is just one out of three. What if we want to put new things into this virtual world? Well, previous techniques are not great at reconstructing 3D information from a photo or a video of something. And this one was from just 3 years ago, and everything is so coarse here. I don’t want to play in a world like that.
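The clean-up idea from the rendering part above (learn to repair an imperfect result instead of producing a perfect one directly) can be illustrated with a toy sketch. This is not the paper's model: a linear least-squares map stands in for the neural network, smooth 1D signals stand in for clean renders, and additive noise stands in for rendering artifacts. All function names and parameters here are hypothetical.

```python
import numpy as np


def make_clean_signals(rng, count, length):
    """Hypothetical stand-in for clean renders: smooth random sinusoids."""
    t = np.linspace(0.0, 1.0, length)
    freq = rng.uniform(1.0, 3.0, size=(count, 1))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=(count, 1))
    return np.sin(2.0 * np.pi * freq * t + phase)


def corrupt(rng, clean, noise_std=0.3):
    """Hypothetical stand-in for rendering artifacts: additive Gaussian noise."""
    return clean + rng.normal(0.0, noise_std, size=clean.shape)


def fit_fixer(degraded, clean):
    """Fit a linear map W so that degraded @ W approximates clean."""
    weights, *_ = np.linalg.lstsq(degraded, clean, rcond=None)
    return weights


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


rng = np.random.default_rng(0)

# Training pairs: (imperfect result, clean target).
train_clean = make_clean_signals(rng, 2000, 32)
train_degraded = corrupt(rng, train_clean)
fixer = fit_fixer(train_degraded, train_clean)

# Apply the learned fixer to unseen imperfect results.
test_clean = make_clean_signals(rng, 200, 32)
test_degraded = corrupt(rng, test_clean)
test_fixed = test_degraded @ fixer

error_before = mse(test_degraded, test_clean)
error_after = mse(test_fixed, test_clean)
print(f"error before clean-up: {error_before:.4f}")
print(f"error after clean-up:  {error_after:.4f}")
```

The point of the sketch is only the training setup: the model never has to produce a result from scratch, it only has to map a flawed result to a better one.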
Now, things have gotten a bit better since. For singular objects, newer AI methods can get pretty good results. But wait until you try an entire scene of objects. They completely fall apart. Even the better ones have trouble understanding object alignment and scale. Now check this out. This one is from a different research lab. Wow! A new AI technique can do what none of the previous ones can do, and that is, take just one image, not of an object, but of an entire scene, and create a digital 3D version of it. So let’s go back to that alignment scene, and see the new one. Wow, so cool! The whole scene, with the correct scales, and no objects intersecting each other.

So, how? Well, it has two incredible ideas to make this happen. One, it has been infused with a GPT-like AI model that is meant to understand the relation of these objects. And it is doing a glorious job at that. And now let me show you the second one, my favorite. Look at this reconstruction. As we expected, positions and scales are correct in this scene, they are true to the input photos, that is excellent. But…come on, man. The guitar is poking through the box. That is really difficult to guess correctly, but here is the second genius idea: you don’t need to do that. The scene is generally good, but does not obey the laws of physics. Floating, poking things. So now, hold on to your papers, Fellow Scholars, and just run a simple correction step that is inspired by physics simulations, and let it sort out all of these issues. Can it? Oh my, look at that beauty! Fantastic, so, worlds and things are good. But the last puzzle piece:

Virtual Humans

people, not so much. That is the grand challenge. Previous techniques are not great at creating digital versions of real humans. Do you want to talk to someone who looks like this? I think not. Unfortunately, this problem may be just too tough. Why? You see, we are wired to look at and understand each other’s faces and gestures, so if something is off just by a tiny bit, the game is over. But it turns out, there is a solution.

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Here is the new technique. Well, this is just so much better than the previous ones, I don’t even know what to say. Let’s look into the paper and see how they did it. Oh yes. Take a bunch of deformable Gaussians, deformable little bumps, attach them cleverly onto the geometry of the face, and this can finally capture detailed facial motion. Even up to 4K resolution. And you can throw some really strong gesturing at it, and all of those deformations are now present in your virtual version. So good! Now, not perfect. There are still some missing details, also, the teeth and eye movements are not great yet. There is still a little twitching going on too. But now let’s invoke the First Law of Papers, which says, do not look at where we are, look at where we will be two more papers down the line.

So, near-perfect virtual worlds are in the works, and there is incredible progress on it. What a time to be alive! Once again, these are papers that very few people are talking about, and I am worried that if we don’t talk about them here on Two Minute Papers, no one will know about them. If you appreciate that, subscribe and hit the bell icon and you’ll see a lot more stuff like this here.
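The idea of attaching Gaussians to the geometry of the face can be sketched in a few lines. The transcript does not say how the attachment works, so the scheme below is an assumption: each Gaussian center is pinned to one mesh triangle with fixed barycentric weights, so it follows the surface when the mesh deforms. The tiny two-triangle mesh and all names are hypothetical.

```python
import numpy as np


def gaussian_centers(vertices, triangles, triangle_ids, barycentric):
    """Place each Gaussian center on its triangle using barycentric weights.

    vertices:     (V, 3) mesh vertex positions
    triangles:    (T, 3) vertex indices per triangle
    triangle_ids: (G,)   triangle each Gaussian is attached to
    barycentric:  (G, 3) fixed weights per Gaussian, each row sums to 1
    """
    corners = vertices[triangles[triangle_ids]]  # (G, 3 corners, 3 coords)
    return np.einsum("gk,gkd->gd", barycentric, corners)


# A hypothetical flat patch of "face" made of two triangles.
rest_vertices = np.array(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
)
triangles = np.array([[0, 1, 2], [1, 3, 2]])

# Two Gaussians, one attached to each triangle.
triangle_ids = np.array([0, 1])
barycentric = np.array([[1 / 3, 1 / 3, 1 / 3], [0.5, 0.25, 0.25]])

rest_centers = gaussian_centers(rest_vertices, triangles, triangle_ids, barycentric)

# Deform the mesh: lift one vertex, as a facial expression might.
deformed_vertices = rest_vertices.copy()
deformed_vertices[3] += np.array([0.0, 0.0, 0.5])
deformed_centers = gaussian_centers(
    deformed_vertices, triangles, triangle_ids, barycentric
)

print(rest_centers)
print(deformed_centers)
```

Only the Gaussian on the triangle that touches the moved vertex changes position. A full method would also have to update each Gaussian's shape, orientation, and appearance, which this sketch leaves out.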
