# Image Synthesis From Text With Deep Learning | Two Minute Papers #116

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=rAbhypxs1qQ
- **Date:** 29.12.2016
- **Duration:** 4:05
- **Views:** 123,389

## Description

The paper "StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks" is available here:
https://arxiv.org/abs/1612.03242

Source code for this project is also available here:
https://github.com/hanzhanggit/StackGAN

We have a Patreon post on the improvements you can expect from Two Minute Papers in 2017. Lots of goodies behind the link, have a look! https://www.patreon.com/posts/7607896

Our previous episode on Recurrent Neural Networks:
https://www.youtube.com/watch?v=Jkkjy7dVdaY

Recurrent Neural Network Writes Sentences About Images:
https://www.youtube.com/watch?v=e-WB4lfg30M

WE WOULD LIKE TO THANK OUR GENEROUS PATREON SUPPORTERS WHO MAKE TWO MINUTE PAPERS POSSIBLE:
Sunil Kim, Julian Josephs, Daniel John Benton, Dave Rushton-Smith, Benjamin Kang.
https://www.patreon.com/TwoMinutePapers

Subscribe if you would like to see more of these! - http://www.youtube.com/subscription_center?add_user=keeroyz

Music: Dat Groove by Audionautix is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/)
Artist: http://audionautix.com/

Thumbnail background image credit - https://pixabay.com/photo-1616713/
Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook → https://www.facebook.com/TwoMinutePapers/
Twitter → https://twitter.com/karoly_zsolnai
Web → https://cg.tuwien.ac.at/~zsolnai/

## Contents

### [0:00](https://www.youtube.com/watch?v=rAbhypxs1qQ) Segment 1 (00:00 - 04:00)

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. This is what we have been waiting for. Earlier we talked about a neural network that was able to describe in a full sentence what we can see on an image, and it had done a damn good job at that. Then we talked about a technique that did something really crazy: the exact opposite. We wrote a sentence, and it created new images according to that.

This is already incredible, and we can create an algorithm like this by training not one but two neural networks. The first is the generative network that creates millions of new images, and the discriminator network judges whether these are real or fake images. The generative network can improve its game based on this feedback and will create more and more realistic-looking images, while the discriminator network gets better and better at telling real images from fake ones. Like humans, this rivalry drives both neural networks toward perfecting their craft. This architecture is called a generative adversarial network. It is also like the classic, ever-escalating arms race between criminals who create counterfeit money and the government, which seeks to implement newer and newer measures to tell a real $100 bill from a fake one.

Previous generative adversarial networks were adept at creating new images, but due to their limitations, their image outputs were the size of a stamp at best. And we were wondering how long until we get much higher resolution images from such a system. Well, I am delighted to say that apparently within the same year, in this work, a two-stage version of this architecture is proposed. The stage one network is close to the generative adversarial networks we described, and most of the fun happens in the stage two network, which takes this rough low-resolution image and the text description, and is told to correct the defects of the previous output and create a higher resolution version of it.
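The adversarial loop described above can be illustrated on a toy problem. This is a minimal sketch of the generator-versus-discriminator training idea on 1-D data, not the paper's method or the StackGAN code: the generator only learns a single shift applied to noise, and the discriminator is a single logistic unit. All names and hyperparameters here (`generate`, `discriminate`, the learning rate, the Gaussian toy data) are hypothetical choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data the generator should imitate: samples from N(4, 1).
def real_batch(n):
    return rng.normal(4.0, 1.0, size=n)

# Generator: g(z) = z + theta, where theta is its only learnable parameter.
theta = 0.0

def generate(n):
    return rng.normal(0.0, 1.0, size=n) + theta

# Discriminator: logistic unit D(x) = sigmoid(w*x + b), outputs P(x is real).
w, b = 0.0, 0.0

def discriminate(x):
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

lr = 0.05
for step in range(2000):
    real = real_batch(64)
    fake = generate(64)

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0
    # (gradient of the binary cross-entropy loss w.r.t. w and b).
    dr, df = discriminate(real), discriminate(fake)
    grad_w = np.mean((dr - 1.0) * real) + np.mean(df * fake)
    grad_b = np.mean(dr - 1.0) + np.mean(df)
    w -= lr * grad_w
    b -= lr * grad_b

    # Generator step: push D(fake) toward 1, i.e. fool the discriminator
    # (gradient of -log D(g(z)) w.r.t. theta is -(1 - D) * w).
    fake = generate(64)
    df = discriminate(fake)
    theta -= lr * np.mean(-(1.0 - df) * w)

# The rivalry drives theta toward the real data's mean of 4.0.
print(f"learned shift theta = {theta:.2f} (real mean is 4.0)")
```

The alternating updates are the key point: each network's improvement creates a harder problem for the other, which is the arms-race dynamic the episode describes.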
In the video, the input text description and the stage one results are shown. Building on that, the higher resolution stage two images are presented, and the results are unreal. There was a previous article and Two Minute Papers episode on the unreasonable effectiveness of recurrent neural networks. If that is unreasonable effectiveness, then what is this? The rate of progress in machine learning research is unlike any other field I have ever seen. I honestly can't believe what I am seeing here. Dear Fellow Scholars, what you see might very well be history in the making.

Are there still faults in the results? Of course there are. Are they perfect? No, they certainly aren't. However, research is all about progress, and it's almost never possible to go from zero to 100% with one new revolutionary idea. I am sure that in 2017, researchers will start working on generating full HD animations with an improved version of this architecture. Make sure to have a look at the paper, where the ideas, challenges, and possible solutions are very clearly presented. For now, I need some time to digest these results. Currently, I feel like I have been dropped into the middle of a science fiction movie.

This will be our last video for this year. We have had an amazing year with some incredible growth on the channel. Way more of you Fellow Scholars decided to come with us on our journey than I would have imagined. Thank you so much for being a part of Two Minute Papers. We'll be continuing full steam ahead next year, and for now I wish you a merry Christmas and happy holidays. 2016 was an amazing year for research, and 2017 will be even better. Stay tuned. Thanks for watching and for your generous support, and I'll see you next time.

---
*Source: https://ekstraktznaniy.ru/video/14733*