# NVIDIA Vid2Vid: AI-Based Video-to-Video Synthesis!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=GRQuRcpf5Gc
- **Date:** 09.09.2018
- **Duration:** 3:37
- **Views:** 140,074
- **Source:** https://ekstraktznaniy.ru/video/14417

## Description

The paper "Video-to-Video Synthesis" and its source code are available here:
https://tcwang0509.github.io/vid2vid/
https://github.com/NVIDIA/vid2vid

Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers

We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
313V, Andrew Melnychuk, Angelos Evripiotis, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dennis Abts, Emmanuel, Eric Haddad, Eric Martel, Esa Turkulainen, Evan Breznyik, Geronimo Moralez, Kjartan Olason, Lorin Atzberger, Marten Rauschenberg, Michael Albrecht, Michael Jensen, Milan Lajtoš, Morten Punnerud Engelstad, Nader Shakerin, Owen Skarpness, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga.
https://www.patreon.com/TwoMinutePapers

Crypto and PayPal links are available below. Thank you very much for your generous support!
Bitcoin: 13hhmJnLEzwXgmgJN7RB6bWVdT7W

## Transcript

### Segment 1 (00:00 - 03:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. Do you remember the amazing pix2pix algorithm from last year? It was able to perform image translation, which means that it could take a daytime image and translate it into a nighttime image, create maps from satellite images, or create photorealistic shoes from crude drawings. I remember that I almost fell off the chair when I first saw the results.

But this new algorithm takes it up a notch and transforms these edge maps into human faces. Not only that, but it also animates them in time. As you see here, it also takes into consideration the fact that the same edges may result in many different faces, and therefore it is also willing to give us more of these options. If I fell out of the chair for the still image version, I don't really know what the appropriate reaction would be to this.

It can also take a crude map of labels, where each color corresponds to one object class, such as roads, cars, or buildings, and it follows how our labels evolve in time and creates an animation out of it. We can also change the meaning of our labels easily. For instance, in the lower left you see how the buildings are now suddenly transformed into trees, or we can also change the trees to become buildings.

Do you remember motion transfer from a couple of videos ago? It can do a similar variant of that too, and it even synthesizes the shadows around the character in a reasonably correct manner.

As you see, the temporal coherence of this technique is second to none, which means that it remembers what it did with past images and doesn't do anything drastically different for the next frame, and therefore generates smoother videos. This is very apparent, especially when juxtaposed with the previous pix2pix method.

So there are three key differences from the previous technique to achieve this. One, the original architecture uses a generator network to create images, where there is also a separate discriminator network that judges its work and teaches it to do better. Instead, this work uses two discriminator neural networks: one checks whether the images look good one by one, and one more discriminator oversees whether the sequence of these images would pass as a video. This discriminator cracks down on the generator network if it creates sequences that are not temporally coherent, and this is why we have minimal flickering in the output videos. Fantastic idea.

Two, to ease the training process, it also does it progressively, which means that the network is first faced with an easier version of the problem that progressively gets harder over time. If you have a look at the paper, you will see that the training is progressive both in terms of space and time. I love this idea.

Three, it also uses a flow map that describes the changes that took place since the previous frame.

Note that the previous pix2pix algorithm was published in 2017, a little more than a year ago. I think that is a good taste of the pace of progress in machine learning research. Up to 2K resolution, 30 seconds of video, and the source code is also available. Congratulations, folks, this paper is something else. Thanks for watching and for your generous support, and I'll see you next time!
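
To make the flow map mentioned in the transcript a bit more concrete, here is a minimal NumPy sketch of how a per-pixel flow field can be used to warp the previous frame toward the current one. It is only an illustration under stated assumptions, not the vid2vid implementation: the function name `warp_with_flow`, the `(dx, dy)` backward-flow convention, and the nearest-neighbour sampling with border clamping are all choices made for this example.

```python
import numpy as np


def warp_with_flow(prev_frame: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Warp the previous frame with a backward flow map.

    Assumed convention (hypothetical, chosen for this sketch):
    flow[y, x] = (dx, dy) means that pixel (x, y) of the new frame is
    copied from (x + dx, y + dy) in the previous frame. Sampling is
    nearest-neighbour and source coordinates are clamped to the image.
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Source coordinates in the previous frame, rounded and clamped.
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev_frame[src_y, src_x]


# Toy example: a 3x4 "frame" whose content moves one pixel to the left,
# so every new pixel is read from its right-hand neighbour (dx = +1).
prev = np.arange(12).reshape(3, 4)
flow = np.zeros((3, 4, 2))
flow[..., 0] = 1.0
warped = warp_with_flow(prev, flow)
print(warped)
```

A frame predicted this way stays close to the previous one wherever the flow is accurate, which is one intuition for why a flow map helps reduce flicker between consecutive frames.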