# OpenAI’s DALL-E 2: Even More Beautiful Results! 🤯

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=lbUluHiqwoA
- **Date:** 04.06.2022
- **Duration:** 9:51
- **Views:** 477,708
- **Source:** https://ekstraktznaniy.ru/video/13546

## Description

❤️ Train a neural network and track your experiments with Weights & Biases here: http://wandb.me/paperintro

📝 The paper "Hierarchical Text-Conditional Image Generation with CLIP Latents" is available here:
https://openai.com/dall-e-2/

📝 Our Separable Subsurface Scattering paper with Activision-Blizzard:
https://users.cg.tuwien.ac.at/zsolnai/gfx/separable-subsurface-scattering-with-activision-blizzard/

📝 Our earlier papers with the caustics:
https://users.cg.tuwien.ac.at/zsolnai/gfx/photorealistic-material-editing/
https://users.cg.tuwien.ac.at/zsolnai/gfx/adaptive_metropolis/

Try it out: https://www.craiyon.com (once again, note that this is an unofficial and reduced version. It also runs through Gradio, which is pretty cool, check it out!)

❤️ Watch these videos in early access on our Patreon page or join us here on YouTube: 
- https://www.patreon.com/TwoMinutePapers
- https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/join

🙏 We would like to thank our generous Patreon suppo

## Transcript

### What is DALL-E 2? [0:00]

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we are going to play some more with OpenAI’s amazing technique DALL-E 2, where we can write a text description, and the AI creates an amazing image of exactly that. That sounds cool, and it gets even cooler the crazier the ideas we give to it.

This is an AI endowed with a diffusion-based model, which means that when we ask it something, it starts out from noise, and over time, it iteratively refines this image to match our description better. And, over time, magically, an absolutely incredible image emerges. This is a neural network that is given a ton of images, each with a piece of text description that says what is in this image. That is one image-caption pair. DALL-E 2 is given millions and millions of these pairs. So, what can it do with all this knowledge?

Well, the best part is that it can combine things! Oh yes, this is the key. This does not copy images from this training data, but it truly comes up with novel images. How? Well, after it had seen a bunch of images of koalas, and separately, motorcycles, it starts to understand the concept of both, and it will be able to combine the two together into a completely new image.

And here you can see how much DALL-E 2 has improved since DALL-E 1.

### DALL-E 1 vs DALL-E 2 [1:40]

It is on a completely different level from its first iteration. This is so much better, and once again, this is an AI that was improved this much in just a year? I can hardly believe what I am seeing here. What a time to be alive!

And now that some time has passed, new, interesting capabilities have emerged. Now, hold on to your papers, and let’s have a look at 10 more amazing examples.
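The "start out from noise and iteratively refine" idea from the introduction can be illustrated with a toy loop. This is only a minimal sketch under loud assumptions: `toy_denoiser` is a hypothetical stand-in that already knows the target, whereas a real diffusion model would use a trained neural network conditioned on the text, and the blend and noise schedules here are made up for illustration and are not DALL-E 2's.

```python
import random


def toy_denoiser(noisy, target):
    """Hypothetical stand-in for the trained network.

    A real model would predict the clean image from the noisy one and the
    text prompt. Here we simply return the known target.
    """
    return list(target)


def sample(target, steps=50, seed=0):
    """Start from Gaussian noise and refine it step by step toward the target."""
    rng = random.Random(seed)
    # One noise value per "pixel" of the toy image.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for t in range(steps, 0, -1):
        guess = toy_denoiser(x, target)
        blend = 1.0 / t                      # move further toward the guess as t shrinks
        noise_scale = 0.1 * (t - 1) / steps  # inject less noise over time, none on the last step
        x = [
            xi + blend * (gi - xi) + noise_scale * rng.gauss(0.0, 1.0)
            for xi, gi in zip(x, guess)
        ]
    return x


if __name__ == "__main__":
    print(sample([0.2, 0.8, 0.5, 0.1]))
```

Because the last step blends fully toward the denoiser's guess and adds no noise, the returned values land on the target; the intermediate steps are where the "noisy image slowly becoming clear" behavior lives.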

### 1 - It can make videos [2:15]

One, for instance, it can even create not just images, but small videos. Videos? How? Well, look at this video where a Victorian house is getting modernized! Wow. So what is going on here? Well, we can enter a text prompt, get an image, then change the text just a tiny bit, get a slightly changed image, and repeat the process over and over until we get an amazing video like this. We can also run this process backwards and Victorianize a modern building as well.

Two, if you don’t believe that it can combine several existing concepts into something that definitely does not exist, hold on to your papers, and have a look at this one.
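Returning to the video trick from point one, the "change the text a tiny bit and repeat" loop could be sketched as follows. Everything here is an assumption for illustration: the percentage-based prompt wording is one hypothetical way to nudge the text gradually, and `placeholder_generate` is a made-up stand-in that returns a label instead of calling any real image model.

```python
def prompt_sequence(subject, start_style, end_style, steps):
    """Build prompts that drift gradually from start_style to end_style."""
    if steps < 2:
        raise ValueError("need at least two steps to form a transition")
    prompts = []
    for i in range(steps):
        pct = round(100 * i / (steps - 1))  # share of the end style, 0..100
        prompts.append(
            f"{subject}, {100 - pct}% {start_style} and {pct}% {end_style}"
        )
    return prompts


def placeholder_generate(prompt):
    """Hypothetical image generator: returns a text label, not pixels."""
    return f"<image for: {prompt}>"


def make_frames(prompts, generate=placeholder_generate):
    """Generate one frame per prompt, in order."""
    return [generate(p) for p in prompts]


if __name__ == "__main__":
    prompts = prompt_sequence("a house", "Victorian", "modern", 5)
    for frame in make_frames(prompts):
        print(frame)
    # Running the process backwards just means reversing the prompt order.
    backwards = make_frames(list(reversed(prompts)))
```

Keeping the generator as a parameter means a real text-to-image call could be swapped in without touching the prompt logic.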

### 2 - Leonardo da Apple [3:04]

Oh yes, Apple products, Leonardo da Vinci style. This is truly insane. If all this does not feel like humanlike intelligence, I don’t know what is.

### 3 - Robot learns a new language [3:19]

Three, here is how the AI imagines a robot learning a new language.

Four, it can create new kinds of drinks, and, my goodness. Well, I am super happy now.

### 4 - New AI-generated drinks! [3:28]

You are probably asking, Károly, why are you super happy? Well, I am a light transport researcher by trade, and I spend a great deal of my time computing caustics. Oh yes, caustics are these beautiful patterns that emerge when we hit a reflective or refractive object with light just the right way. And these can look especially magical if we have an object with a complex geometry. And the fact that the AI also understands this about our world... I am truly speechless. Beautiful.

### 5 - Toilet car [4:12]

Five, it can also create new inventions. This is a toilet car. You know, some people are quite busy and if you have to go when you are on the go, well, this one is for you. What a time to be alive!

### 6 - Lightbulbs! [4:27]

Six, in this one, the AI puts up a clinic in understanding combinations of concepts. Check this out! This is plants surrounding a lightbulb. Now, a lightbulb surrounding some plants. Now a plant with a lightbulb inside. And a lightbulb with plants inside. It can follow these instructions really well in all of these cases.

### 7 - Murals [4:56]

Seven, it can create larger images too, so much so that entire murals can be requested. And here, not just the quality, but the variety of results is truly a sight to behold.

### 8 - Darth Ant [5:11]

Eight, this is the famous Sith Lord, Darth Ant. Oh yes. This is Darth Vader reimagined as a robot ant. Loving it.

### 9 - Text! [5:23]

Nine, if you remember, previously, when we requested that it write a piece of text on a sign, it floundered a great deal. This sign is supposed to say “deep learning”. And, look! The amazing Peter Welinder found a way to make it write things on signs properly. And all this with an amazing depth of field effect.

Once again, this is an AI where a vast body of knowledge lies within, but it only emerges if we can bring it out with properly written prompts. It almost feels like a new kind of programming that is open to everyone, even people without any programming or technical knowledge. This is prompt engineering, if you will. Perhaps a new kind of job that is just coming into existence.

### 10 - Pro photography [6:14]

Ten, now check this out. We can even give instructions to it as a photographer would instruct their camera, and request mammatus clouds. We marveled together at a simulation of those in an earlier video.

### Subsurface scattering! [6:28]

And, oh my goodness. That cannot be true. Look. The hand has subsurface scattering. What is that? That is the effect of light penetrating the skin, bouncing around, and either coming out on the same, or the other side. It has this absolutely beautiful halo effect. We worked on this a bit in an earlier paper together with Activision Blizzard, and it took a great deal of mathematics and physics to perform this efficiently in a computer graphics engine. And now, the AI just knows what it is. I really don’t know what to say.
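To make the "penetrating, bouncing around, and coming out on the same or the other side" description concrete, here is a toy one-dimensional random walk through a slab of material. This is a minimal sketch under stated assumptions and is not the method from the Separable Subsurface Scattering paper: the slab model, `mean_free_path`, `absorb_prob`, and the function name `simulate_slab` are all hypothetical choices for illustration.

```python
import random


def simulate_slab(n_photons=10_000, thickness=1.0, mean_free_path=0.2,
                  absorb_prob=0.05, seed=0):
    """Count where photons end up after scattering inside a 1D slab."""
    rng = random.Random(seed)
    counts = {"same_side": 0, "other_side": 0, "absorbed": 0}
    for _ in range(n_photons):
        depth, direction = 0.0, 1.0  # enter at the surface, heading inward
        while True:
            # Travel a random distance before the next scattering event.
            depth += direction * rng.expovariate(1.0 / mean_free_path)
            if depth < 0.0:
                counts["same_side"] += 1   # came back out where it entered
                break
            if depth > thickness:
                counts["other_side"] += 1  # passed all the way through
                break
            if rng.random() < absorb_prob:
                counts["absorbed"] += 1    # swallowed by the material
                break
            direction = rng.choice((-1.0, 1.0))  # scatter forward or backward
    return counts


if __name__ == "__main__":
    print("thin slab: ", simulate_slab(thickness=0.1))
    print("thick slab:", simulate_slab(thickness=5.0))
```

In this toy model, a thin slab lets far more light through to the other side than a thick one, which loosely mirrors why thin parts of a backlit hand glow.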

### Try it out yourself! [7:10]

And, as always, +1 because I couldn’t resist: if you are interested in trying a reduced version of DALL-E, check this out. The link is available in the video description. Once again, please note that this is a highly reduced version of DALL-E, but it is still quite fun. Let the experiments begin!

And, I also cannot wait to get access to the full model, some Two Minute Papers mascot figures, and obviously, images of wise scholars holding on to their papers must come into existence!

### Changing the world [7:45]

I am sure this tool will democratize art creation by putting it into the hands of all of us. We all have so many ideas and so little time. And DALL-E will help with exactly that. So cool. Just imagine having an AI artist that is, let’s say, just half as good as a human artist, but the AI can paint 10,000 images a day for free. Cover art for your album? Illustrations for your novel? Not a problem. A little brain in a jar for everyone, if you will. And, now, it’s a really good one. Bravo OpenAI.

I am starting to believe that the impact of this AI is going to be so strong, there will be the world as we know it before DALL-E, and the world after it. So, does this get your mind going? What else would you use this for? Let me know in the comments below!

Thanks for watching and for your generous support, and I'll see you next time!
