DALL-E: Text-to-Image generation - Explained!
Video description
In this video, we take a look at DALL-E for text-to-image generation. What is it? Why do we have it? How does it look?
ABOUT ME
⭕ Subscribe: https://www.youtube.com/c/CodeEmporium?sub_confirmation=1
📚 Medium Blog: https://medium.com/@dataemporium
💻 Github: https://github.com/ajhalthor
👔 LinkedIn: https://www.linkedin.com/in/ajay-halthor-477974bb/
RESOURCES
[1 📚] Slides: https://link.excalidraw.com/p/readonly/NXtiUh19HjH4BuC2IQ6V
[2 📚] DALL-E main paper: https://arxiv.org/pdf/2102.12092
[3 📚] DALL-E blog page: https://openai.com/index/dall-e/
[4 📚] Evolution of auto encoders: https://youtu.be/XyWNmHZi1oA?si=0X5iE2FKfToDaRNM
[5 📚] Colab notebook I put together to understand the Gumbel distribution, the Gumbel-max trick, and the Gumbel-Softmax relaxation: https://colab.research.google.com/drive/1KSKB3AIUzyMnpym8HeSVZCxOtzS-DI9u#scrollTo=1af4a395
[6 📚] Nice mathematical proof of the Gumbel-max trick: https://github.com/priyammaz/PyTorch-Adventures/blob/main/PyTorch%20for%20Generation/AutoEncoders/Intro%20to%20AutoEncoders/gumbel_softmax_quantizer.ipynb
[7 📚] Attention is all you need paper: https://arxiv.org/pdf/1706.03762
[8 📚] An Image is Worth 16x16 Words paper: https://arxiv.org/pdf/2010.11929
[9 📚] Improving Language Understanding by Generative Pre-Training paper: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
[10 📚] Learning Bounded Context-Free-Grammar via LSTM and the Transformer: Difference and Explanations paper: https://arxiv.org/pdf/2112.09174
[11 📚] DALL-E architecture code: https://github.com/openai/DALL-E/blob/master/dall_e/encoder.py
PLAYLISTS FROM MY CHANNEL
⭕ Reinforcement Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd9kS--NgVz0EPNyEmygV1Ha&si=AuThDZJwG19cgTA8
⭕ Natural Language Processing: https://youtube.com/playlist?list=PLTl9hO2Oobd_bzXUpzKMKA3liq2kj6LfE&si=LsVy8RDPu8jeO-cc
⭕ Transformers from Scratch: https://youtube.com/playlist?list=PLTl9hO2Oobd_bzXUpzKMKA3liq2kj6LfE
⭕ ChatGPT Playlist: https://youtube.com/playlist?list=PLTl9hO2Oobd9coYT6XsTraTBo4pL1j4HJ
⭕ Convolutional Neural Networks: https://youtube.com/playlist?list=PLTl9hO2Oobd9U0XHz62Lw6EgIMkQpfz74
⭕ The Math You Should Know: https://youtube.com/playlist?list=PLTl9hO2Oobd-_5sGLnbgE8Poer1Xjzz4h
⭕ Probability Theory for Machine Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd9bPcq0fj91Jgk_-h1H_W3V
⭕ Coding Machine Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd82vcsOnvCNzxrZOlrz3RiD
MATH COURSES (7 day free trial)
📕 Mathematics for Machine Learning: https://imp.i384100.net/MathML
📕 Calculus: https://imp.i384100.net/Calculus
📕 Statistics for Data Science: https://imp.i384100.net/AdvancedStatistics
📕 Bayesian Statistics: https://imp.i384100.net/BayesianStatistics
📕 Linear Algebra: https://imp.i384100.net/LinearAlgebra
📕 Probability: https://imp.i384100.net/Probability
OTHER RELATED COURSES (7 day free trial)
📕 ⭐ Deep Learning Specialization: https://imp.i384100.net/Deep-Learning
📕 Python for Everybody: https://imp.i384100.net/python
📕 MLOps Course: https://imp.i384100.net/MLOps
📕 Natural Language Processing (NLP): https://imp.i384100.net/NLP
📕 Machine Learning in Production: https://imp.i384100.net/MLProduction
📕 Data Science Specialization: https://imp.i384100.net/DataScience
📕 Tensorflow: https://imp.i384100.net/Tensorflow
CHAPTERS
00:00 What is DALL-E?
00:33 Why DALL-E with historical context
03:35 Components of DALL-E: dVAE and GPT
04:39 Stage 1: discrete VAE training
08:00 Stage 2: GPT training
11:38 Inference
13:36 dVAE encoder
15:58 dVAE image tokenizer
17:33 dVAE decoder
18:14 dVAE loss
20:56 Gumbel Distribution
23:20 Gumbel Max Trick
27:27 Gumbel Softmax Relaxation
29:20 Quiz Time
30:17 Summary
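CODE SKETCH
The chapters on the Gumbel distribution, the Gumbel max trick, and the Gumbel Softmax relaxation can be illustrated with a minimal standard-library Python sketch. This is not the code from the video or the Colab notebook; the function names (sample_gumbel, gumbel_max_sample, gumbel_softmax), the example logits, and the temperature value are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample by inverse transform sampling."""
    u = rng.random()
    # random() can return exactly 0.0, where log(0) is undefined, so redraw.
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_max_sample(logits, rng):
    """Gumbel max trick: argmax of (logit + Gumbel noise) is a draw from softmax(logits)."""
    noisy = [logit + sample_gumbel(rng) for logit in logits]
    return max(range(len(logits)), key=lambda i: noisy[i])


def gumbel_softmax(logits, temperature, rng):
    """Gumbel Softmax relaxation: replace the hard argmax with a tempered softmax.

    Lower temperatures give vectors closer to one-hot; higher ones are smoother.
    """
    noisy = [(logit + sample_gumbel(rng)) / temperature for logit in logits]
    return softmax(noisy)


if __name__ == "__main__":
    rng = random.Random(0)       # fixed seed, illustrative only
    logits = [1.0, 2.0, 3.0]     # hypothetical unnormalized scores
    n = 20000
    counts = Counter(gumbel_max_sample(logits, rng) for _ in range(n))
    print("target softmax:", [round(p, 3) for p in softmax(logits)])
    print("empirical freq:", [round(counts[i] / n, 3) for i in range(len(logits))])
    print("relaxed sample:", [round(v, 3) for v in gumbel_softmax(logits, 0.5, rng)])
```

The hard argmax in gumbel_max_sample is not differentiable, which is why the relaxation swaps it for a softmax with a temperature. In a deep learning framework the same idea would be written with tensors so gradients can flow through the relaxed sample.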