# AI Makes Near-Perfect DeepFakes in 40 Seconds! 👨

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=iXqLTJFTUGc
- **Date:** 07.05.2021
- **Duration:** 7:12
- **Views:** 317,212
- **Source:** https://ekstraktznaniy.ru/video/13917

## Description

❤️ Check out Perceptilabs and sign up for a free demo here: https://www.perceptilabs.com/papers

📝 The paper "Iterative Text-based Editing of Talking-heads Using Neural Retargeting" is available here:
https://davidyao.me/projects/text2vid/

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Haro, Alex Serban, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Haddad, Eric Martel, Gordon Child, Haris Husic,  Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Ramsey Elbasheer, Robin Graham, Steef, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Meet and discuss your ideas with other Fellow Scholars.

## Transcript

### <Untitled Chapter 1> [0:00]

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Imagine that you are a film critic, and

### Film Grades [0:07]

you are recording a video review of a movie. But unfortunately, you are not the best kind of movie critic, and you record it before watching the movie. But here is the problem: you don't really know if it's going to be any good. So you record this: "I'm going to give Hereditary a B minus." So far so good, nothing too crazy going on here. However, you go in, watch the movie, and it turns out to be amazing. So what do we do if we don't have time to re-record the video? Well, we grab this AI, type in the new text, and it will give us this: "I'm going to give Hereditary an A plus." Whoa! What just happened? What kind of black magic is this? Well, let's look behind the person, at the blackboard. You see some delicious partial derivatives, and I am starting to think that this person is not a movie critic. And of course he isn't, because this is Yoshua Bengio, a legendary machine learning researcher, and this was an introduction video where he says this. What happened is that it has been repurposed by this new deepfake generator AI, where we can type in anything we wish, and out comes a near-perfect result. It synthesizes both the video and the audio content for us.

But we are not quite done yet; something is missing. If the movie gets an A plus, the gestures of the subject also have to reflect that this is a favorable review. So what do we do? Maybe add a smile. Is that possible? "I'm going to give Hereditary an A plus." Oh yes, there we go. Amazing. Let's have a closer look at one more example, where you can see how easily we can drop in new text with this editor: why yao or worry over silly items, Marvel movies are not cinema.

Now, this is not the first method performing this task; previous techniques typically required hours and hours of video of a target subject. So how much training data does this require to perform all this? Well, let's have a look together. Look: this is not the same footage copy-pasted three times. This is the synthesized video output if we have 10 minutes of video data from the test subject. This one looks nearly as good and has fewer sharp details, but in return, it requires only two and a half minutes. And here comes the best part. If you look here, you may be able to see the difference, and if you have been holding on to your paper so far, now squeeze that paper, because synthesizing this only required 30 seconds of video footage of the target subject. My goodness! But we are not nearly done yet. It can do more.
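To make the workflow described above a little more concrete, here is a minimal sketch of what such a text-driven edit could look like as code. Everything in it (the `TalkingHeadEditor` class, its methods, and the file names) is a hypothetical illustration of the idea, not the paper's actual code or API.

```python
# Hypothetical sketch of a text-driven talking-head editing workflow.
# The class, its methods, and the file names are illustrative placeholders,
# not the paper's actual interface.

from dataclasses import dataclass


@dataclass
class Edit:
    old_text: str               # phrase spoken in the original recording
    new_text: str               # phrase we want the subject to say instead
    smile: bool = False         # performance control: add a smile
    gesture_scale: float = 1.0  # tone gestures up (>1) or down (<1)


class TalkingHeadEditor:
    """Placeholder for a model prepared from a short clip of the subject."""

    def __init__(self, reference_video: str):
        # Per the video, anywhere from 30 seconds to 10 minutes of footage.
        self.reference_video = reference_video

    def apply(self, edit: Edit, output_path: str) -> None:
        # A real system would re-synthesize lip motion, audio, and gestures
        # for the edited span and blend them back into the original footage.
        print(f"Replacing '{edit.old_text}' with '{edit.new_text}' "
              f"(smile={edit.smile}, gesture_scale={edit.gesture_scale}) "
              f"-> {output_path}")


editor = TalkingHeadEditor("critic_review.mp4")
editor.apply(
    Edit(old_text="a B minus", new_text="an A plus", smile=True),
    output_path="critic_review_edited.mp4",
)
```

The point is simply that the user supplies a short reference recording, the new text, and a couple of performance controls, and the system produces the re-synthesized video and audio.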

### Performance Control: Speaking Style [3:20]

For instance, it can tone up or down the intensity of gestures to match the tone of what is being said. Look! So how does this wizardry happen? Well, this new technique improves two things really well. One, it can search for phonemes and other units better. Here is an example: we crossed out the word "spider" and we wish to use the word "fox" instead, and it tries to assemble this word from previous occurrences of individual sounds; for instance, the "ox" part is available when the test subject utters the word "box". And two, it can stitch them together better than previous methods.
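The search-and-assemble idea can be illustrated with a toy sketch: build an inventory of phonemes from words the subject has already said, then reuse prior occurrences to spell out a new word. The phoneme labels and the tiny corpus below are made-up assumptions for illustration; the actual method retrieves and blends recorded audio-video segments rather than strings.

```python
# Toy illustration of assembling a new word ("fox") from phonemes the subject
# has already uttered. The phoneme spellings and the tiny corpus are made up.

CORPUS = {
    "box":    ["B", "AA", "K", "S"],   # provides the "ox" part of "fox"
    "feel":   ["F", "IY", "L"],        # provides the "f" sound
    "spider": ["S", "P", "AY", "D", "ER"],
}

def build_inventory(corpus):
    """Map each phoneme to the words (prior occurrences) it can be taken from."""
    inventory = {}
    for word, phonemes in corpus.items():
        for ph in phonemes:
            inventory.setdefault(ph, []).append(word)
    return inventory

def assemble(target_phonemes, inventory):
    """Pick a prior occurrence for every phoneme of the new word.
    A real system would also score how well neighboring segments stitch
    together; here we simply take the first available match."""
    plan = []
    for ph in target_phonemes:
        sources = inventory.get(ph)
        if not sources:
            raise ValueError(f"phoneme {ph!r} never occurs in the corpus")
        plan.append((ph, sources[0]))
    return plan

inventory = build_inventory(CORPUS)
# "fox" is roughly F + AA + K + S.
for phoneme, source_word in assemble(["F", "AA", "K", "S"], inventory):
    print(f"take {phoneme!r} from an earlier utterance of {source_word!r}")
```

In the real system, the retrieved pieces are then stitched together seamlessly, which is the second improvement mentioned above.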

### Short Edit [4:04]

And surely this means that, since it needs less data, the synthesis must take a great deal longer, right? No, not at all: the synthesis part only takes 40 seconds. And even if it couldn't do this so quickly, the performance control aspect, where we can tone the gestures up or down or add a smile, would still be an amazing selling point in and of itself. But no, it does all of these things quickly and with high quality at the same time. Wow!

I now invite you to look at the results carefully and give them a hard time. Did you find anything out of the ordinary? Did you find this believable? Let me know in the comments below. The authors of the paper also conducted a user study with 110 participants, who were asked to look at 25 videos and say which ones they felt were real. The results showed that the new technique outperforms previous techniques even when those have access to 12 times more training data, which is absolutely amazing. But what is even better: the longer the video clips were, the better this method fared. What a time to be alive!

Now, of course, beyond the many amazing use cases of deepfakes in reviving deceased actors, creating beautiful visual art, redubbing movies, and more, we have to be vigilant about the fact that they can also be used for nefarious purposes. The goal of this video is to let you and the public know that these deepfakes can now be created quickly and inexpensively, and they don't require a trained scientist anymore. If this can be done, it is of utmost importance that we all know about it. And beyond that, whenever they invite me, I inform key political and military decision makers about the existence and details of these techniques to make sure that they also know about them, and using that knowledge, they can make better decisions for us. You can see me doing that here. Note that these talks and consultations all happen free of charge, and if they keep inviting me, I'll keep showing up to help with this in the future, as a service to the public.

PerceptiLabs is a visual API for TensorFlow carefully designed to make machine learning as intuitive as possible. This gives you a faster way to build out models, with more transparency into how your model is architected, how it performs, and how to debug it. Look, it lets you toggle between the visual modeler and the code editor. It even generates visualizations for all the model variables and gives you recommendations both during modeling and training, and it does all this automatically. I only wish I had a tool like this when I was working on my neural networks during my PhD years. Visit perceptilabs.com/papers to easily install the free local version of their system today. Our thanks to PerceptiLabs for their support and for helping us make better videos for you. Thanks for watching and for your generous support, and I'll see you next time!
