Meta’s New AI: Outrageously Good!
6:12

Two Minute Papers · 12.02.2025 · 76,904 views · 3,154 likes

Video description
❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

📝 The paper "VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models" is available here: https://hila-chefer.github.io/videojam-paper.github.io/

Vs Veo2: https://x.com/TomLikesRobots/status/1888279188336963725

📝 My paper on simulations that look almost like reality is available for free here: https://rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations: https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Benji Rabhan, B Shang, Christian Ahlin, Gordon Child, John Le, Juan Benet, Kyle Davis, Loyal Alchemist, Lukas Biewald, Michael Tedder, Owen Skarpness, Richard Sundvall, Steef, Taras Bobrovytsky, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

My research: https://cg.tuwien.ac.at/~zsolnai/
X/Twitter: https://twitter.com/twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - http://felicia.hu

Table of contents (2 segments)

Segment 1 (00:00 - 05:00)

Now check this out, Fellow Scholars. My goodness! These stunning results were all made by a new text-to-video AI called VideoJAM. Now, the first question is, of course: can it compete with OpenAI's Sora? That is a groundbreaking system that was absolutely amazing at remembering details. You look at something, it gets occluded, and bam, it has to come back the same. However, it still has its flaws. There are consistency issues. Ouch. Also, sometimes prompt comprehension is not the best. Here, pull-ups are supposed to happen, but they don't. So our question is: can this new system out-Sora Sora? Well, let's see together.

Ooh! With Sora, this footage is not usable unless you are making a horror movie, but with VideoJAM, there is no contest. And when trying the juggler, we get very similar results, and the new one is so good I have to slow it down and look at it frame by frame. Yes, we are now at the point where we have to be pixel peeping to find out that this is not real. And it has a much better understanding of motion and physics. When we pour water into a glass, just wow. Look at how beautiful it is, how it even models how bubbles are formed.

But that is nothing compared to what you're going to see now. You see, I spent many years of my life writing up computer simulations for things like water drops and crown splashes. They are incredibly complex. Here is an earlier paper that can do it, but the amount of expertise required for that is just ridiculous. Well, not anymore. Look. It looked at many videos from real life, and it now understands that too. Something that took me years to understand, and now you just enter a prompt and the AI just gets it. And, aha, did you catch it? We found a little weakness: there is a little pop every few frames.

And when it comes to understanding the world and physics, look. I mean, just try to write a computer program for blowing out these candles. All the chemical reactions, turbulent wind flows, almost impossible. But with VideoJAM, there you go. I can't believe what I am seeing here. Look at how lifelike that is. Wow.

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér.

Now, it is not only smart when it comes to physics, but it is also creative. Now let's put roller skates on a raccoon. If I ask a text-to-image AI, I often get four roller skates, sometimes three. Okay. But when I start thinking about it, I ask: how would this actually work in reality? Well, let's ask VideoJAM. And now, hold on to your papers, Fellow Scholars, because this is the part where I fell off the chair when reading this paper. It says: nope, we are using just two roller skates, and reserve the front two hands to push itself, then to balance itself, and for braking as well. Kind of. See, here's the difference: this is not just making images, this is making videos, and in the videos it actually has to work. It has to dream up things that really work.

Okay, now that I have calmed down somewhat, let's compare it to DiT, the technique it is based on. In research papers, the result is almost always the following: the new technique works better in some places and worse in others. But not here. I mean, come on. It outperforms its predecessor by a great deal on every single example I have seen. That is stunning.

So how is all this magic possible? Well, let's pop the hood and, hmm, interestingly, the idea is surprisingly simple. Step number one: training. We give it a few video frames, and then we ask: now, little AI, predict what is about to happen. Now, stage two: video creation. We let the AI generate the next frames, but now we give it a little helper, that is, its own motion predictions. It looks at what it thinks should move and where it should move, and uses that to guide itself toward smoother, more natural motion. They call it inner guidance, and there are two things you should know about it. One: it works. Two, and here comes the key, if you take away one thing from this video, it should be this one: this can be applied to any other video model. Yes, this is like a magic ingredient in a soup, and it can enhance any other soup that you might have around. Yummy.

And we also have a comparison to DeepMind's Veo 2 on the same prompt, and yes, we see that they are comparable, but I don't think these two works are competing

Segment 2 (05:00 - 06:00)

because Veo 2 might get further improved by this idea. That is absolutely amazing.

Now, limitations. The results are not super high resolution, I think we can all agree on that. I also haven't found a way to run it myself. However, the paper is available, so I think you might see it introduced into other systems very soon. And with that, finally, everyone will be able to become a film director. You won't need tons of money and expensive equipment anymore. All you will need is a text prompt and a vivid imagination. But the AI can help with even that, because it has to create things that really work. Kind of. So, what do you think? What would you Fellow Scholars use this for? Let me know in the comments below.

Would you like to run your own copy of DeepSeek in the cloud, cheaply, without using the official app? Yes? Then try Lambda GPU Cloud. They have so many powerful NVIDIA GPUs with tons of memory. I use them regularly too. Seriously, try it out now at lambdalabs.com/papers, or click the link in the description.
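
Editor's sketch: the "inner guidance" idea from the transcript, where the model's own motion prediction steers generation, might look roughly like the Python below. The transcript gives no formula, so the combination rule here is an assumption borrowed from classifier-free guidance. The function name, the three inputs, and both weights are hypothetical, and the predictions are plain lists of floats standing in for real model outputs.

```python
from typing import List

Vector = List[float]


def inner_guidance(
    full: Vector,
    no_text: Vector,
    no_motion: Vector,
    w_text: float = 1.0,    # hypothetical weight, not from the source
    w_motion: float = 1.0,  # hypothetical weight, not from the source
) -> Vector:
    """Combine three predictions from the same (hypothetical) video model.

    full      -- prediction made with both the text prompt and the model's
                 own motion prediction available
    no_text   -- the same prediction with the text prompt dropped
    no_motion -- the same prediction with the motion signal dropped

    Assumed rule (classifier-free-guidance style): start from `full` and
    push it further away from each weakened prediction.
    """
    if not (len(full) == len(no_text) == len(no_motion)):
        raise ValueError("all predictions must have the same length")
    return [
        f + w_text * (f - nt) + w_motion * (f - nm)
        for f, nt, nm in zip(full, no_text, no_motion)
    ]


if __name__ == "__main__":
    # Toy numbers only; a real model would supply tensors for every frame.
    guided = inner_guidance([1.0, 2.0], [0.0, 2.0], [1.0, 1.0],
                            w_text=2.0, w_motion=3.0)
    print(guided)
```

With both weights set to zero the function returns `full` unchanged. That matches the claim in the video that the trick is an add-on to an existing video model, not a replacement for it.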
