# Stable Diffusion Got Supercharged - For Free!

## Metadata

- **Channel:** Two Minute Papers
- **YouTube:** https://www.youtube.com/watch?v=1RvZWHtFXuY
- **Date:** 22.05.2023
- **Duration:** 6:24
- **Views:** 136,519

## Description

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum
❤️ Get more than $50 off an upcoming W&B event in San Francisco! - https://shorturl.at/brtIQ

📝 The paper "Adding Conditional Control to Text-to-Image Diffusion Models" is available here:
https://arxiv.org/abs/2302.05543

Try it out!
ControlNet - https://github.com/lllyasviel/ControlNet
ControlNet guide - how to install and use it: https://stable-diffusion-art.com/controlnet/

Transform yourself (dance) - https://www.reddit.com/r/StableDiffusion/comments/12i9qr7/i_transform_real_person_dancing_to_animation/
Group photo (cartooning) - https://reddit.com/r/StableDiffusion/comments/12nd60i/turn_a_group_photo_into_a_digital_painting_with/
Decartooning - https://reddit.com/r/StableDiffusion/comments/1377c0l/decartooning_using_regional_prompter_controlnet/
W&B logo: https://twitter.com/weights_biases/status/1643261405771055105
Video as control signal: https://www.reddit.com/r/StableDiffusion/comments/12keizv/sdcnanimation_v04_update_is_out_separate_flow/

My latest paper on simulations that look almost like reality is available for free here:
https://rdcu.be/cWPfD 

Or here is the original Nature Physics link with clickable citations:
https://www.nature.com/articles/s41567-022-01788-5

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Martin, Matthew Valle, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Thumbnail background design: Felícia Zsolnai-Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

#stablediffusion

## Contents

### [0:00](https://www.youtube.com/watch?v=1RvZWHtFXuY) Segment 1 (00:00 - 05:00)

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we are going to look at an incredible way to supercharge Stable Diffusion, which is a free and open text-to-image AI. Text goes in, beautiful images come out. We all know that. However, this is ControlNet, a new neural network structure that lets us give it additional inputs. What does that mean? More control.

For instance, we can provide just the edges of an input image. This can be a rough sketch, or edges extracted from a real photo, and bam, we get a beautiful image with exactly that creature and framing. Wow, that is exactly what I want!

We can also provide this thing that we call a boundary map, and ControlNet is able to follow that too.

Do you know what else also works? Segmentation maps. This is just a rough draft of where different objects should be, and my goodness, out comes a beautiful piece of landscape and architecture, or even an interior. And just look at that. All this came from these really coarse, poor images. And don't forget, this works both ways, so we can even give the AI a photo, ask for the segmentation map, and then add a prompt to make it a painting, remove a tree, or anything else.

Pose works too. A stickman goes in, and a photorealistic image comes out. That is incredible.

So, what else can we do with ControlNet? Well, after looking at the four amazing examples I am about to show you, by the end of this video you will see that a more appropriate question would be: what can't we do with it? I will also tell you in a moment whether you can try it yourself.

Now hold on to your papers, Fellow Scholars, because this is going to be insane. One, with ControlNet, we can even transform ourselves. In goes a dancing video of us, and out comes this. Is this really happening? I cannot believe this. Just look at how good the temporal coherence is! Yes, there are some unexpected changes from image to image, a little flickering, but this is improving so fast that two more papers down the line, I bet my papers that no one will be able to tell. Just imagine how artists, and even non-artists, will be able to unleash their creativity with this. Don't have an expensive camera to record this? Not a problem. No access to expensive lighting equipment? Not a problem. Heck, you don't even need to be at the location; it can be synthesized! This is incredible.

Two, we can even create a group photo, and we can all enter a virtual world where we are recreated with the same pose, character by character. How cool is that? So that was cartooning. And, are you thinking what I am thinking?

Oh yes, three, decartooning is also possible. This way, we can make cartoons and even our own drawn characters come to life. I really like how the skirt has been reimagined in a way that it could really exist in the real world.

Four, Weights & Biases also asked the AI to reimagine their logo. And it did not disappoint. Some of these are really imaginative. And wait, this is not just style transfer. Look at this animation example where the original real footage is used as a control signal. At first, these indeed look like style transfer into a watercolor or oil painting, but now, look! It can truly reimagine the footage. Amazing!

And the best part about ControlNet is that whenever I am working with a tool like this, I feel more satisfied and more strongly involved in the creative process. I am not just prompting; I feel like I am at least partly creating something.

I also feel like people are sleeping on this paper. They don't yet know that the world of digital art as we know it will change very quickly. More of us will have access to fantastic tools that we could never afford before. And now it is available to all of us, for free! The possibilities truly make my head spin. And this is just one paper. And this is a technique that is free and open to everyone; we build this together, so you Fellow Scholars are all welcome to contribute prompts, code, experiments, anything. This gives power to the people for free, and that's what I want. Yes, I put a link to it in the video description, and to a guide too, so you can try it for free.
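
The video links to the original ControlNet repository and a web-UI guide; as a rough illustration of the edge-conditioned workflow described above, here is a minimal sketch using the Hugging Face diffusers library. The input file name and the prompt are made up for the example; the model IDs are the public Canny-conditioned ControlNet and Stable Diffusion 1.5 weights.

```python
# Minimal sketch: edges extracted from a real photo act as the control signal,
# and the text prompt fills in everything else. Assumes diffusers, opencv-python,
# and a CUDA GPU; "input.jpg" is a hypothetical input file.
import cv2
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

# Extract Canny edges from the photo and turn them into a 3-channel control image.
image = np.array(Image.open("input.jpg").convert("RGB"))
edges = cv2.Canny(image, 100, 200)
edges = np.concatenate([edges[:, :, None]] * 3, axis=2)
control_image = Image.fromarray(edges)

# Load Stable Diffusion together with the Canny-conditioned ControlNet weights.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# Text goes in; the edge map constrains the composition of what comes out.
result = pipe(
    "a majestic dragon, detailed digital art",
    image=control_image,
    num_inference_steps=30,
).images[0]
result.save("output.png")
```

Swapping the preprocessor and the ControlNet checkpoint (segmentation, depth, OpenPose skeletons, and so on) gives the other control modes shown in the video without changing the rest of the pipeline.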

### [5:00](https://www.youtube.com/watch?v=1RvZWHtFXuY&t=300s) Segment 2 (05:00 - 06:00)

And one more interesting tidbit from the paper: this neural network can even work with smaller datasets, so if we have fewer than 50 thousand images, that is completely fine, wow, but it can scale to billions of images if we so desire. A big thanks to the Stable Diffusion subreddit, as it hosts a ton of amazing artistic works made with ControlNet. And now, let the experiments begin!

Thanks for watching and for your generous support, and I'll see you next time!
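
A quick note on why that small-dataset claim is plausible: the paper locks the pretrained Stable Diffusion weights, makes a trainable copy of the encoder blocks, and connects the copy to the locked model through 1x1 "zero convolutions" whose weights and biases start at zero. At initialization the control branch therefore contributes nothing and cannot damage the pretrained model. Below is a minimal PyTorch sketch of that idea; the class name, the single-block framing, and the shapes are illustrative, not the paper's actual code.

```python
import copy

import torch
import torch.nn as nn


def zero_conv(channels: int) -> nn.Conv2d:
    """1x1 convolution initialized to all zeros, as in the ControlNet paper."""
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv


class ControlledBlock(nn.Module):
    """Illustrative wrapper: a frozen pretrained block plus a trainable copy.

    The copy sees the control signal and feeds back through a zero convolution,
    so at the start of training the output equals the frozen block's output.
    """

    def __init__(self, pretrained_block: nn.Module, channels: int):
        super().__init__()
        self.locked = pretrained_block
        for p in self.locked.parameters():
            p.requires_grad_(False)  # the base model stays frozen
        self.trainable_copy = copy.deepcopy(pretrained_block)  # starts from the same weights
        self.zero_in = zero_conv(channels)   # injects the control signal
        self.zero_out = zero_conv(channels)  # gates the copy's contribution

    def forward(self, x: torch.Tensor, control: torch.Tensor) -> torch.Tensor:
        base = self.locked(x)
        # Both zero convs output zeros at initialization, so base passes through unchanged.
        residual = self.trainable_copy(x + self.zero_in(control))
        return base + self.zero_out(residual)
```

Because the frozen path is untouched at step zero, fine-tuning starts from the full strength of the pretrained model rather than from scratch, which is what makes training on comparatively small datasets viable.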

---
*Source: https://ekstraktznaniy.ru/video/13164*