❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers
📝 The paper "MotionCtrl: A Unified and Flexible Motion Controller for Video Generation" is available here:
https://wzhouxiff.github.io/projects/MotionCtrl/
Try it out: https://huggingface.co/spaces/TencentARC/MotionCtrl_SVD
It is also open source - run it locally:
https://github.com/TencentARC/MotionCtrl
📝 My latest paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD
Or here is the original Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Putra Iskandar, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers
Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Károly Zsolnai-Fehér's research works: https://cg.tuwien.ac.at/~zsolnai/
Twitter: https://twitter.com/twominutepapers
Table of Contents (2 segments)
Segment 1 (00:00 - 05:00)
I cannot believe this is happening. We just talked about this paper, and this paper, and they both have already been surpassed. Unbelievable. And as we go, it will just keep getting better and better. And amazing news: luckily, you can also try this one yourself for free. So, what is going on here? Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér.

Well, around 2021, the first usable text-to-image AIs appeared. You write a text prompt, and out comes an image. Then, this concept just got better and better, and today, these AIs are unbelievably good. So, what's next? Well, of course, not just a stationary image, but multiple images one after another. Yes, a video, if you will. In November 2023, with Stable Video Diffusion, we got a completely free and fully open source technique that generates videos from your text prompts. And alternatively, it can also take an image from you and make it come to life. Absolutely amazing. But this is all old news, this is all in the past.

And now, just a month later, look at this. My goodness. Yes. Now we have customizable camera motion for these AI-generated videos. It can perform the usual bread-and-butter camera motions like pan up, pan down, and zoom out, and it can even control the speed at which these movements take place. That is nice, but I have to say, I kind of expected that. But it gets better. What I did not expect is that we can also specify camera motions ourselves, and can even draw all kinds of curves that would make the scene really pop. Now, in this case, the results are clearly not perfect, but we are experienced Fellow Scholars here, and we already know that if we see results like this, then just two more papers down the line, it will get so much better. But this is not the first technique to perform something like this. So let's compare it to a previous work. Oh yes, as expected, unfortunately, we have lots of artifacts.
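To make the camera control described above concrete: the technique conditions the video generator on a sequence of per-frame camera poses, and the per-frame step size is what sets the speed of the motion. Below is a minimal, hypothetical NumPy sketch of what such a pan/zoom trajectory could look like as data. The function name, frame count, and step sizes are my own illustrative assumptions, not MotionCtrl's actual interface.

```python
import numpy as np

def make_camera_trajectory(n_frames=14, pan_per_frame=0.0, zoom_per_frame=0.0):
    """Build a sequence of 3x4 camera pose matrices [R|t].
    Rotation stays identity; panning translates along x, zooming along z.
    The per-frame step size controls how fast the camera moves."""
    poses = []
    for i in range(n_frames):
        R = np.eye(3)
        t = np.array([pan_per_frame * i, 0.0, zoom_per_frame * i])
        poses.append(np.hstack([R, t[:, None]]))  # one 3x4 pose per frame
    return np.stack(poses)  # shape: (n_frames, 3, 4)

# Example: a slow pan combined with a zoom-out
traj = make_camera_trajectory(n_frames=14, pan_per_frame=0.02, zoom_per_frame=-0.05)
print(traj.shape)  # (14, 3, 4)
```

Doubling `pan_per_frame` would double the panning speed while keeping the same number of frames, which is the kind of speed control the video shows.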
The subsequent images have to follow each other naturally, but here they do not give us the impression that this is a video from out there in the wild. It is missing the most important part: temporal coherence. The lifeblood of videos. Now, let's see the new technique, and… whoa. This is so much better. Temporal coherence for days! Still not perfect, but this is something that could help a ton of you creative Fellow Scholars out there.

But you know what's coming. Yes, it gets even better. But how? Wait, I hear you saying: Károly, the main issue with Stable Video Diffusion was that sometimes there was very little motion in the footage. In particular, look, here the camera moves but the subject does not move. This new technique is better in the sense that it allows you to customize the motion of the camera, but it does not allow you to move the subjects themselves. Except that VideoComposer, the previous work, does attempt to do that too. So how did it do? Well, it did extremely well. But not in the way you think. For this skier, it has many artifacts, but as an unintentional side effect, look, this AI has also invented an excellent cloning machine. Not exactly what we asked for, but come on. This falling feather is also trying a little too hard to adhere to this curve, and the motion is unfortunately not believable. So, that's it, right? No video control for the subjects.

Well, don't despair. Now hold on to your papers, Fellow Scholars, and look! The new technique. Look at the beautiful movement of the feather. Wow. That is a lot more believable. And so is the skier. Just one skier, if I may add, but a good one at that. And all this just one more paper down the line; I can hardly believe it. And we can use this for so much. The new technique can make that wind chime sway in the wind, and we can even specify the wind for it. The paper planes can fly as we wish them to fly,
Segment 2 (05:00 - 07:00)
and we can even let those cats and zebras out for a walk. This is like an AI puppeteer that can control the stage and the puppets at the same time. And you know what's coming. Yes, it gets better again! We can even combine the two and make the camera and the subject move at the same time. I absolutely love it. The results are clearly not perfect, but if you get something that you don't like, don't forget, these run for nearly free, and the whole thing just takes a few seconds. That means that you can have as many samples as you wish. Just look at the different variants of how the skateboarding teddy bear behaves, and choose the one that is most in line with your artistic vision. So good! And I bet that just two more papers down the line, we will be able to create short movies with it just by writing text prompts and drawing motions for the actors in the movie, and it will make all of that happen.

And here comes the best part: you can try it too, for free! As of the making of this video, I tried this demo and it worked flawlessly. Now don't forget, there are a lot of you Fellow Scholars, and we have crashed a great many websites before. So once again, during the Scholarly Stampede, please be patient. And if you enjoyed this video and would like to see more content like this, make sure to subscribe and hit the bell icon so you don't miss out. We have some absolutely amazing papers coming up soon.
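For the subject motion shown throughout the video (the feather, the skier, the cats, the paper planes), the control signal is essentially a hand-drawn curve turned into one target position per frame. Here is a minimal, hypothetical NumPy sketch of that resampling step; the function and the example curve are illustrative assumptions, not code from the MotionCtrl repository.

```python
import numpy as np

def resample_trajectory(points, n_frames=14):
    """Resample a hand-drawn 2D curve (a list of (x, y) points) into
    exactly n_frames positions spaced evenly along its arc length,
    giving the subject one target position in every video frame."""
    pts = np.asarray(points, dtype=float)
    seg_lengths = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg_lengths)])  # arc length so far
    targets = np.linspace(0.0, cum[-1], n_frames)          # equal spacing
    x = np.interp(targets, cum, pts[:, 0])
    y = np.interp(targets, cum, pts[:, 1])
    return np.stack([x, y], axis=1)  # shape: (n_frames, 2)

# A falling-feather-style S-curve sketched with a few control points
curve = [(0, 0), (30, 40), (-10, 80), (20, 120)]
traj = resample_trajectory(curve, n_frames=14)
print(traj.shape)  # (14, 2)
```

Resampling by arc length rather than by control-point index keeps the subject moving at a steady pace along the drawn curve, regardless of how unevenly the points were sketched.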