The Next Level of AI Video Games Is Here!

Two Minute Papers · 23.09.2025 · 75,110 views · 3,237 likes

Video description
❤️ Check out Vast.ai and run DeepSeek or any AI project: https://vast.ai/papers

📝 Magica 2 is available here: https://blog.dynamicslab.ai/
Try it out: https://demo.dynamicslab.ai/chaos

📝 My paper on simulations that look almost like reality is available for free here: https://rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations: https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Sven Pfiffner, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

My research: https://cg.tuwien.ac.at/~zsolnai/
X/Twitter: https://twitter.com/twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu

Contents (2 segments)

Segment 1 (00:00 - 05:00)

Check out this amazing new AI technique, Magica 2, where an image goes in, your image, and a playable video game comes out. And then we are going to explode this person for no reason.

And if you look back just one year ago, this was possible with Google DeepMind’s Genie 2, and this is way better than that. I’ll tell you about the differences with Genie 3 in a moment. And as of the making of this video, if everything goes well and if we haven’t crashed their servers yet, you can hopefully try it out too, even on your phone. That’s what they say. Note that we are not affiliated with this company in any way.

Okay, now this concept is amazing because this image can be a real video game, something like Cyberpunk. Or even a painting. Man, let’s take Starry Night and look into that, I’d love to see that. Wow. It’s really amazing to see this painting come alive as a real world. Now, this is not even close to perfect: as we go on for longer, it starts to become less and less like itself.

However, it gets better. You can have a drawing of yours also come alive. This one is way more consistent, although the AI did not have that much to do, so let’s give it something more difficult. Oh yes, this is going to be a city, not just a pier. An interesting, quirky little city made of paper and scribbles. Really cool! Once again, it starts to become a bit less like itself over time.

This effect gets even more apparent with this pencil sketch. Okay, let’s enter this world. So far so good, but it’s a bit like a guided tour in IKEA. Stay on the arrows, you’re fine. Wander off… and you’ll never be seen again, my friend.

But even for all the good and bad, this really shows how incredibly quickly the AI space improves over time. Now I will note that I have found no research paper for this work. I’ll let it slide this time, but only because it is a brilliant showcase of how far we’ve come in less than one year.
Okay, as promised, let’s talk differences. Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér.

Google DeepMind’s Genie 2 was a bit like a goldfish trying to direct a movie - it forgets what happened three seconds ago, so every new frame is a brand new plot. Genie 3 is like a dog dreaming. It runs, barks, chases something, and for a minute or two it looks visually consistent. We don’t know how long.

And this one promises 10 minutes. As for interaction latency, for Genie 3 they say instant, but I cannot know for sure as they did not offer to let me try it; for this one, it is 200 milliseconds. Not for the pros out there who beat Silksong with one hand tied behind their backs, no. But for a tech demo as a stepping stone, sounds really great. Genie 3 runs in a Google datacenter somewhere on Earth, this one runs on a single consumer GPU.

We are wise Fellow Scholars here, so we will all take this with a grain of salt as we have no research paper yet, but I’ll keep an eye out.

Now, before we try it together, I won’t leave you hanging: the architecture is probably somewhat similar to what Genie 2 did, which is the following. It was a diffusion world model that turns video into a simpler form, then it predicts the next frame step-by-step using past frames and your actions, kind of like how a text model predicts the next word in your sentence. So, put more simply, it is like a storyteller with a flipbook - you tell it what the hero does next, and it quickly sketches the next page based on the previous ones, flipping forward frame by frame to bring the story to life.

And now, you can also try it through the link in the description. I hope. For me, it did something, but it was not super fun. I just press and press the buttons, sometimes something happens, maybe, most of the time, not so much. But I looked around and people reported that it works for them, so I hope it works better for you too.
Now this other game worked much better, I could move the camera, walk around, jump, attack… kind of. Uh… sir? Sir! Are you okay, sir? Okay, this valiant knight has clearly eaten something he shouldn’t have, and I don’t wanna be around to find out what it was.

Whew! That was close. Okay, now this work, however, does one thing very well. And that is… it exists, and you know what that means. The First Law of Papers says that two more papers down the line,

Segment 2 (05:00 - 06:00)

it will be improved a great deal. Just think about the fact that one year ago, we had Genie 2: low quality footage, seconds of memory if that, and only platformers, the same game basically. And now, up to 10 minutes of memory, in much higher quality. More variety too.

Now, limitations. They say character control is not yet perfect, with certain movements like right turns occasionally showing reduced responsiveness. Well, you saw it: for me, reduced responsiveness was flowery words for not working at all. But try it out yourself, and let me know in the comments how it went. Remember, low expectations. This is a super early tech demo of something that was impossible last year.
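
Editor's note: the flipbook description of the Genie 2 style architecture earlier in the video (compress frames into a simpler form, then predict the next frame from past frames and the player's action) can be sketched as a toy loop. This is only an illustration under loud assumptions: no paper is available for Magica 2, and `encode`, `decode`, `predict_next`, `rollout`, and the `ACTIONS` table are all hypothetical stand-ins invented here, with a hand-written formula where a real system would use a trained diffusion network.

```python
from collections import deque
from typing import Deque, List, Sequence

Frame = List[float]   # toy "frame": a flat list of pixel intensities
Latent = List[float]  # its compressed ("simpler") form

# Hypothetical action set; a real demo would map keys or touch input to actions.
ACTIONS = {"left": -1.0, "stay": 0.0, "right": 1.0}


def encode(frame: Frame, factor: int = 2) -> Latent:
    """Compress a frame by averaging neighbouring pixels (stand-in for a learned encoder)."""
    return [sum(frame[i:i + factor]) / factor for i in range(0, len(frame), factor)]


def decode(latent: Latent, factor: int = 2) -> Frame:
    """Expand a latent back to frame size by repeating values (stand-in for a learned decoder)."""
    return [value for value in latent for _ in range(factor)]


def predict_next(context: Sequence[Latent], action: str) -> Latent:
    """Toy dynamics: average the remembered latents, then nudge them by the action.

    A real world model would run a trained network here, not this formula.
    """
    n = len(context)
    mean = [sum(column) / n for column in zip(*context)]
    return [m + 0.1 * ACTIONS[action] for m in mean]


def rollout(first_frame: Frame, actions: Sequence[str], memory: int = 4) -> List[Frame]:
    """Generate one frame per action, feeding each prediction back in as context."""
    context: Deque[Latent] = deque([encode(first_frame)], maxlen=memory)
    frames: List[Frame] = []
    for action in actions:
        next_latent = predict_next(context, action)
        context.append(next_latent)  # the oldest latent drops out once memory is full
        frames.append(decode(next_latent))
    return frames


# Start from a 4-pixel "image" and press three buttons.
frames = rollout([0.0, 0.0, 1.0, 1.0], ["right", "right", "left"])
```

In this sketch the context is a bounded `deque`, so the starting image eventually falls out of memory and every new frame is built only from earlier predictions. That is one plausible, simplified way to picture why the worlds in the demo become "less and less like themselves" over time; whether the real system behaves this way for that reason is not stated in the video.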
