Stable Video AI Watched 600,000,000 Videos!

9:50


Two Minute Papers · 03.12.2023 · 154,526 views · 6,547 likes

Video description

❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers

Stable Video Diffusion: https://stability.ai/news/stable-video-diffusion-open-ai-video-model
Emu video: https://emu-video.metademolab.com/
Emu edit: https://emu-edit.metademolab.com/

Try them out:
- https://huggingface.co/spaces/multimodalart/stable-video-diffusion
- https://emu-video.metademolab.com/#/demo

Run it locally:
https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt
https://github.com/Stability-AI/generative-models

Guide: https://www.reddit.com/r/StableDiffusion/comments/180smii/stabilityai_researcher_shares_tips_on_how_to_run/

Video credits:
https://twitter.com/DiffusionPics/status/1727316123379704235
https://twitter.com/DiffusionPics/status/1726935847113806133
https://twitter.com/fofrAI/status/1727097718873440473
https://twitter.com/PurzBeats/status/1727155328482226458
https://twitter.com/multimodalart/status/1727161210812928385
https://twitter.com/thibaudz/status/1727078190521180269
https://twitter.com/sumith1896/status/1727046123742007455
https://twitter.com/c0nsumption_/status/1727114628021285356
https://x.com/purzbeats/status/1728871517424148750?s=46
https://twitter.com/maxescu/status/1727673742568952216
https://x.com/skirano/status/1728167226295927227?s=46
https://twitter.com/fofrAI/status/1727104190135369797

Chatbot arena leaderboard: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard

📝 My latest paper on simulations that look almost like reality is available for free here: https://rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations: https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bret Brizzee, Bryan Learn, B Shang, Christian Ahlin, Gaston Ingaramo, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Kenneth Davis, Klaus Busse, Kyle Davis, Lukas Biewald, Martin, Matthew Valle, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Putra Iskandar, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu
Károly Zsolnai-Fehér's research works: https://cg.tuwien.ac.at/~zsolnai/
Twitter: https://twitter.com/twominutepapers

Contents (2 segments)

Segment 1 (00:00 - 05:00)

Finally, it is here. From today, we can all become film directors. Yes, text to video and image to video that is open source and free for all of us, and it can even make images of memes come alive. This is Stable Video, which has studied 600 million videos. And we have this too, and this too. Oh my, three amazing papers. Yummy!

So what is going on here? Well, simple: you just write a piece of text, and Stable Video can generate a video for you in about 2 to 3 minutes. It compares favorably against the competition at this moment. I say "at this moment" because this result was recorded at a particular point in time, and these systems improve so rapidly that, for instance, Runway may be way better by the time you see this comparison. But it doesn't end here, not even close. There is another text-to-video AI that you can kind of try right now, and there is even more. In fact, there is so much going on, I don't even know where to start.

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér here.

So first, Stable Video. This was trained on about 600 million videos and can now generate new ones for you. It is free and open source; however, you still need some computational resources to run it. I'll put potential places that can run it for you in the video description. If you found some other place where other Fellow Scholars can run it for free, please leave a comment about it. Thank you! It takes approximately two to three minutes to create a video, and there is a lot to like here. Finally, an open-source solution. This means that you will soon be able to run this on the phone in your pocket, as freely as you wish. Glorious!

However, it is not perfect, not even close. Sometimes you get no real animation, but instead a camera panning around. Also, you probably already inferred that from these results, but it cannot generate longer videos. But that's not all: its generated videos also typically showcase not too much motion. Third, you know the deal: don't expect good text outputs from it, not yet anyway. And fourth, it is a bit of a chonker. What does that mean? Well, you need a lot of video memory to perform this. I am hearing 40 GB, although there is already a guide to get it down to under 20 or maybe even 10 GB; the link is in the description. From seeing the nature of these limitations, my guess is that the memory requirements will be cut down substantially very soon.

However, there are more tools coming up in the meantime. Here is Emu Video. This is incredible. Look, it is so good at generating natural phenomena, and it even has a hint of creativity. Wow, fantastic results. And the paper showcases this, which is a sight to behold. Goodness, are you seeing what I am seeing? This is a user study where humans look at the results, and whatever other technique you see this compared against, it has a win rate often in the 80% region. Against Imagen Video, too. Here is what Imagen Video looks like, and that is definitely one of the best ones out there, and now this new one is still better. Wow.

But it gets better. Just creating high-quality results is not enough. Just consider a technique that always gives you a high-quality video, perhaps always the same one, and ignores your prompts. That is a high-quality result; however, faithfulness to the prompts also needs to be measured, and on that, this new technique has no equal. Nothing is even close. Wow.

And fantastic news: you can kind of try this technique out for free on a website right now. The link is obviously in the description, waiting for you, Fellow Scholars. You can assemble these text prompts and see immediately what the system will do with them. I love the creativity here. Every solution is at least pretty good in my opinion, and some of the solutions are just excellent. For instance, this one. Ha!
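Editor's note: the "win rate" from the user study mentioned above can be illustrated with a minimal sketch. It assumes each human rater simply names the method they preferred in one side-by-side comparison. The function name `win_rate` and the vote data are hypothetical, made up for illustration, and are not taken from the Emu Video paper.

```python
from collections import Counter


def win_rate(votes, method):
    """Return the fraction of pairwise comparisons in which `method` was preferred."""
    if not votes:
        raise ValueError("no votes to evaluate")
    counts = Counter(votes)
    return counts[method] / len(votes)


# Hypothetical votes: each entry names the method one rater preferred
# in a single side-by-side comparison (made-up data, not from the paper).
votes = ["new"] * 8 + ["baseline"] * 2

rate = win_rate(votes, "new")
print(f"win rate: {rate:.0%}")  # prints "win rate: 80%"
```

A win rate near 50% would mean raters could not tell the two methods apart, which is why values in the 80% region read as a clear preference.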

Segment 2 (05:00 - 09:00)

So good! You can also look at some images here and perform image to video. These really came to life here. So good! Or search for a gallery of text-to-video results. I loved the robots here, but when I looked for scholarly content, nothing. Hmm, we need more scholarly content. Well, maybe next time.

Also, this is a great paper, so it contains a study that is so much more detailed than this. They also look at sharpness, smoothness, amount of motion (yes, you remember from the Stable Video project that this is super important), and object consistency as well. Now, not even this one is perfect. The resolution of these videos is 512 by 512. Not huge, but this is almost guaranteed to be improved just one more paper down the line. Also, this is not open source, not at the moment anyway.

Now, why is it important if it is free and open source, like the previous Stable Video? Well, have a look at this. I love this image. So why is this interesting? Well, have a look, and you see here that the best performing large language models are all proprietary. These are closed models. However, there are other language models that are nearly as good, just a step or two behind, but these are free and open source. So this means that intelligence is not in the hands of just one company, but a nearly as good intelligence you can run yourself on your laptop, and soon on your smartphone too. Just imagine if the best model out there is unwilling to help you or starts hallucinating. In that case, you would have no other choice. But with open-source models, this will never happen. There is always going to be a kind little robot helping you. And this is the importance of open-source models.

And if you think that we are done, well, Fellow Scholar, hold on to your papers for the third amazing paper for today: Emu Edit. This helps us edit images iteratively. The iterative part is key here. This means that we can start from an image that we like, something that we got from a text-to-image AI perhaps, and then it is rarely the case that everything comes out exactly how we envisioned. So from now on, not a problem at all. We just add subsequent instructions, and much of the image will remain; only the parts that we wish to change will be replaced. So if you need the same emu, the same background, but make it a fireman, there we go. And oh my, look, finally, scholarly content. Good!

And when compared to the competition, this one is also so far ahead of them. Goodness, look at that. InstructPix2Pix is just from a year ago and MagicBrush is from less than 6 months ago, and both of them are outperformed significantly here. There are other cases which are a bit closer, but I still prefer the new one here.

So, three amazing use cases, three amazing papers. I hope that you share my feeling that this is an incredible time to be alive. Research breakthroughs are happening every week. What a time to be alive! Subscribe and hit the bell icon if you wish to see more.

If you're looking for inexpensive cloud GPUs for AI, Lambda now offers the best prices in the world for GPU cloud compute. No commitments or negotiation required. Just sign up and launch an instance. And hold on to your papers, because with the Lambda GPU Cloud, you can now get on-demand H100 instances for just $1.99 per hour. Yes, $1.99. And they are one of the first cloud providers to offer publicly available on-demand H100 access. Did I mention they also offer persistent storage? So join researchers at organizations like Apple, MIT, and Caltech in using Lambda Cloud instances, workstations, or servers. Make sure to go to lambdalabs.com/papers to sign up for one of their amazing GPU instances today.
