Microsoft's New AI: Virtual Humans Became Real! 🤯
Duration: 8:23


Two Minute Papers · 20.08.2022 · 325,651 views · 11,793 likes


Video description
❤️ Check out Runway and try it for free here: https://runwayml.com/papers/

📝 The paper "3D Face Reconstruction with Dense Landmarks" is available here: https://microsoft.github.io/DenseLandmarks/

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Ivo Galic, Jace O'Brien, Jack Lukic, John Le, Jonas, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Michael Albrecht, Michael Tedder, Nevin Spoljaric, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi. If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Chapters:
0:00 - Teaser
0:19 - Use virtual worlds!
0:39 - Is that a good idea?
1:28 - Does this really work?
1:51 - Now 10 times more!
2:13 - Previous method
2:35 - New method
3:15 - It gets better!
3:52 - From simulation to reality
4:35 - "Gloves"
5:07 - How fast is it?
5:35 - VS Apple's ARKit
6:25 - Application to DeepFakes

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Instagram: https://www.instagram.com/twominutepapers/
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Contents (13 segments)

Teaser

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today, we are going to see Microsoft’s AI looking at a lot of people who don’t exist, and then, we will see that these virtual people can teach it something about real people. Now, through the power of computer graphics algorithms, we are able to create virtual

Use virtual worlds!

worlds, and of course, within those virtual worlds, virtual humans too. So, here is a wacky idea. If we have all this virtual data, why not use it instead of real photos to train a new AI to do useful things?

Is that a good idea?

Hmm…wait a second. Maybe this idea is not so wacky after all. Especially because we can generate as many of these virtual humans as we wish, and all this data is perfectly annotated. The location and shape of the eyebrows are known, even when they are occluded, and we know the depth and geometry of every single hair strand of the beard. If done well, there will be no issues with the identity of the subjects, or the distribution of the data. Also, we are not limited by our wardrobe or the environments we have access to. In this virtual world, we can do anything we wish. So good! But of course, here is the ultimate question that decides the fate of this project.
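The "perfectly annotated" point can be sketched as a tiny data structure. This is only an illustration of why renderer-generated data is attractive; the field names are assumptions, not the paper's actual data format:

```python
from dataclasses import dataclass

@dataclass
class SyntheticFaceSample:
    """One rendered training example (illustrative sketch only).

    The key property: every label comes for free from the renderer,
    including landmarks hidden behind hair, clothing, or glasses.
    """
    image: list          # rendered RGB frame
    landmarks_2d: list   # (x, y) for every landmark, even occluded ones
    occluded: list       # per-landmark occlusion flags

def make_dataset(n_samples):
    # A virtual pipeline can generate as many perfectly annotated
    # samples as we wish, with no identity or consent issues.
    return [SyntheticFaceSample(image=[], landmarks_2d=[], occluded=[])
            for _ in range(n_samples)]

print(len(make_dataset(100_000)))  # 100000
```

With real photos, occluded landmarks would have to be guessed by human annotators; here they are exact by construction.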

Does this really work?

And that question is: does this work? What is all this good for? And the crazy thing is that Microsoft’s previous AI technique could identify facial landmarks of real people, even though it had never seen a real person before. How cool is that! But, this is a previous work, and now, a new paper has emerged, and in this one, scientists

Now 10 times more!

at Microsoft said, how about more than 10 times more landmarks? Yes, this new paper promises no less than 700. When I saw this, I thought - are you kidding? Are we going 10x just one more paper down the line?
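A quick sanity check on the "more than 10 times more" claim, assuming the earlier sparse sets used the common 68-point layout (an assumption; the video does not state the exact previous count):

```python
# Hypothetical sanity check: 700 dense landmarks vs. a typical
# sparse set of 68 points (the previous count is an assumption).
dense_landmarks = 700
sparse_landmarks = 68

ratio = dense_landmarks / sparse_landmarks
print(f"{ratio:.1f}x more landmarks")  # 10.3x more landmarks
```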

Previous method

Well, I will believe it when I see it. Let’s see a different previous technique from just two years ago. You see that we have temporal consistency issues, in other words, there is plenty of flickering going on here, and there is one more problem: these facial expressions are really giving it a hard time.

New method

Can we really expect any improvement over these two years? Well, hold on to your papers and let’s have a look at the new method and see for ourselves. Look at that! It not only tracks a ton more landmarks, but the consistency of the results has improved a ton as well. So, it both solves a harder problem, and it also does it better than the previous technique. Wow! And all this just one more paper down the line. My goodness. I love it! I feel like this new method is the first that could even track Jim Carrey himself.

It gets better!

And, we are not done yet! Not even close - it gets even better! I was wondering if it still works in the presence of occlusions, for instance, whenever the face is covered by hair or clothing, or a flower. And, let’s see. It still works amazingly well! What about the colors? That is the other really cool thing - it can, for instance, tell us how confident it is in these predictions. Green means confident, red means that the AI has to do more guesswork, often because of these occlusions.
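The green-to-red confidence coloring described above can be sketched as a simple linear blend. This is an assumption about how such a visualization might work, not the paper's actual color map:

```python
def confidence_to_color(confidence):
    """Map a per-landmark confidence in [0, 1] to an (R, G, B) color.

    High confidence -> green, low confidence -> red, as in the
    visualization described above. The linear blend is an assumption.
    """
    c = max(0.0, min(1.0, confidence))  # clamp to [0, 1]
    red = int(255 * (1.0 - c))
    green = int(255 * c)
    return (red, green, 0)

# Clearly visible landmarks get high confidence (mostly green);
# occluded ones get low confidence (mostly red).
print(confidence_to_color(0.95))
print(confidence_to_color(0.10))
```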

From simulation to reality

My other favorite thing is that this is still trained with synthetic data. In fact, it is key to its success. This is one of those success stories where training an AI in a simulated world can be brought into the real world, and it still works spectacularly. There are a lot of factors at play here, so let’s send out a huge thank you to computer graphics researchers as well for making this happen. These virtual characters could not be rendered and animated in real time without decades of incredible graphics research works. Thank you! And now comes the ultimate question: how long do we have to wait for these results?

"Gloves"

This is incredibly important. Why? Well, here is a previous technique that was amazing at tracking our hand movements. Do you see these gloves? Yes? Well, those are not gloves. This is how a previous method understands our hand motions, which is to say that it can reconstruct them nearly perfectly. Stunning work. However, these are typically used in virtual worlds, and we had to wait for nearly an hour for such a reconstruction to happen.

How fast is it?

Do we have the same situation here? You know, 10x better results in facial landmark detection, so what is the price that we have to pay for this? One hour of waiting again? Well, not at all! If you have been holding on to your papers, now, squeeze that paper, because it is not only real time, it is more than twice as fast as real time. It can churn out 150 frames per second and it doesn’t even require your graphics card

VS Apple's ARKit

it runs on your processor. That is incredible. Here is one more comparison against the competitors. For instance, Apple’s ARKit runs on their own iPhones, and thus, they can make use of the additional depth information. That is a goldmine of information. But, this new technique doesn’t, it just takes color data, that is so much harder, but in return, it will run on any phone. Can these results compete with Apple’s solution with less data? Let’s have a look. My goodness, I love it. The results seem at the very least comparably good. That is, once again, amazing progress in just one paper.
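The earlier speed claim checks out with simple arithmetic, assuming "real time" means a typical 60 fps video stream (an assumption; the video does not define the baseline frame rate):

```python
# Back-of-the-envelope check on "150 frames per second,
# more than twice as fast as real time".
fps = 150            # frames per second, on the CPU alone
real_time_fps = 60   # assumed real-time baseline

ms_per_frame = 1000 / fps
speedup = fps / real_time_fps

print(f"{ms_per_frame:.2f} ms per frame")  # 6.67 ms per frame
print(f"{speedup:.1f}x real time")         # 2.5x real time
```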

Application to DeepFakes

So cool! Also, what I am really excited about is that variants of this technique may also be able to improve the fidelity of these DeepFake videos out there. For instance, here is an example of me becoming a bunch of characters from Game of Thrones; this previous work was incredible because it could even track where I was looking. Imagine a new generation of these tools that is able to track even more facial landmarks, and democratize creating movies, games and all kinds of virtual worlds. Yes, with some of these techniques, we can even become a painting or a virtual character as well, and even the movement of our nostrils would be transferred. What a time to be alive! So, does this get your mind going? What would you use this for? Let me know in the comments below! Thanks for watching and for your generous support, and I'll see you next time!
