# AI-Based 3D Pose Estimation: Almost Real Time!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=F84jaIR5Uxc
- **Date:** 23.02.2019
- **Duration:** 2:56
- **Views:** 74,126

## Description

📝 The paper "3D Human Pose Machines with Self-supervised Learning" and its source code is available here:
https://arxiv.org/abs/1901.03798
http://www.sysu-hcp.net/3d_pose_ssl/
https://github.com/chanyn/3Dpose_ssl.git

❤️ Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Claudio Fernandes, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, Jason Rollins, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Lorin Atzberger, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Morten Punnerud Engelstad, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Richard Reis, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga, Zach Doty.
https://www.patreon.com/TwoMinutePapers

Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=F84jaIR5Uxc) Segment 1 (00:00 - 02:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This episode is about a really nice new paper on pose estimation. Pose estimation means that we have an image or video of a human as an input, and the output should be this skeleton that you see here, which shows us the current position of this person. Sounds alright, but what are the applications of this, really? Well, it has a huge swath of applications: for instance, many of you often hear about motion capture for video games and animated movies, but it is also used in medical applications for finding abnormalities in a patient's posture, animal tracking, understanding sign language, pedestrian detection for self-driving cars, and much, much more. So if we can do something like this in real time, that's hugely beneficial for many applications.

However, this is a very challenging task, because humans have a large variety of appearances, images come in all kinds of possible viewpoints, and as a result, the algorithm has to deal with occlusions as well. This is particularly hard, have a look here. In these two cases, we don't see the left elbow, so it has to be inferred from seeing the remainder of the body. We have the reference solution on the right, and as you see here, this new method is significantly closer to it than any of the previous works. Quite remarkable.

The main idea in this paper is that it works out the poses both in 2D and 3D, and contains a neural network that can convert in both directions between these representations while retaining the consistencies between them. First, the technique comes up with an initial guess, and follows up by using these pose transformer networks to further refine this initial guess. This makes all the difference. And not only does it lead to high-quality results, but it also takes way less time than previous algorithms — we can expect to obtain a predicted pose in about 51 milliseconds, which is almost 20 frames per second. 
This is close to real time, and is more than enough for many of the applications we've talked about earlier. In the age of rapidly improving hardware, these are already fantastic results both in terms of quality and performance, and not only the hardware, but the papers are also improving at a remarkable pace. What a time to be alive. The paper contains an exhaustive evaluation section, where it is measured against a variety of high-quality solutions; I recommend that you have a look in the video description. I hope nobody is going to install a system in my lab that starts beeping every time I slouch a little, but I am really looking forward to benefitting from these other applications. Thanks for watching and for your generous support, and I'll see you next time!
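To make the 2D-to-3D "lifting" idea concrete, here is a minimal, hypothetical sketch in the spirit of the paper's bidirectional pose modules: a tiny MLP that maps flattened 2D joint coordinates to 3D ones. The joint count (17, a COCO-style skeleton), the layer sizes, and the random weights are illustrative assumptions, not the authors' actual architecture; the latency-to-framerate arithmetic at the end matches the numbers quoted in the video.

```python
import numpy as np

N_JOINTS = 17           # COCO-style skeleton size (assumption, not from the paper)
IN_DIM = N_JOINTS * 2   # flattened (x, y) per joint
OUT_DIM = N_JOINTS * 3  # flattened (x, y, z) per joint
HIDDEN = 128            # hidden width (illustrative)

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.01, (IN_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.01, (HIDDEN, OUT_DIM))
b2 = np.zeros(OUT_DIM)

def lift_2d_to_3d(pose_2d: np.ndarray) -> np.ndarray:
    """Map a (17, 2) array of 2D joints to a (17, 3) array of 3D joints."""
    x = pose_2d.reshape(-1)               # flatten to a single feature vector
    h = np.maximum(0.0, x @ W1 + b1)      # ReLU hidden layer
    return (h @ W2 + b2).reshape(N_JOINTS, 3)

# A dummy 2D pose (normalized image coordinates) lifted to 3D:
pose_2d = rng.uniform(0.0, 1.0, (N_JOINTS, 2))
pose_3d = lift_2d_to_3d(pose_2d)
print(pose_3d.shape)  # (17, 3)

# Latency math from the video: ~51 ms per predicted pose.
fps = 1000 / 51
print(round(fps, 1))  # 19.6 — "almost 20 frames per second"
```

A real system would of course learn `W1`/`W2` from data and, per the paper, also use a 3D-to-2D projection module so the two representations stay consistent with each other.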

---
*Source: https://ekstraktznaniy.ru/video/14355*