Midjourney AI: How Is This Even Possible?
Duration: 8:30


Two Minute Papers · 18.04.2023 · 259,382 views · 12,184 likes

Video description
❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum
❤️ Get more than $50 off from an upcoming W&B event in San Francisco! - https://www.fullyconnected.com?promo=2mp

Try Stable Diffusion:
Web 1: https://huggingface.co/spaces/stabilityai/stable-diffusion
Web 2: https://beta.dreamstudio.ai/generate
Web 3 (also Stable Diffusion XL!): https://clipdrop.co/stable-diffusion
Web 4 (notebooks): https://github.com/TheLastBen/fast-stable-diffusion
Stable Diffusion Web UI (Windows/MacOS) https://github.com/AUTOMATIC1111/stable-diffusion-webui
Guide for installation: https://github.com/cmdr2/stable-diffusion-ui
Draw Things app (MacOS): https://drawthings.ai/
Simpler app (MacOS): https://huggingface.co/blog/fast-mac-diffusers
Guide: https://stable-diffusion-art.com/know-these-important-parameters-for-stunning-ai-images/

Midjourney (requires subscription): https://www.midjourney.com/app/

My latest paper on simulations that look almost like reality is available for free here: https://rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations: https://www.nature.com/articles/s41567-022-01788-5

Prompts:
Ocean element girl prompt: nicoleespinoza
Video game concept prompt: Herkyms
Humanoid android prompt: !D3v1L_
Dog prompt: throckwoddle
Bedroom prompt: Falcon
Sprite girl: nicoleespinoza
Cat painting: Suspect Jesse
Photo of Jinx (prompt): Brandi.Eduardo
Man and woman photo: TheShadowman

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Martin, Matthew Valle, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Károly Zsolnai-Fehér's links:
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Table of contents (6 segments)

Introduction

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér.

Today we will celebrate how far we have come. Not so long ago, I made these pictures with OpenAI’s text to image AI, DALL-E 2, and I was elated. These are really cool, and have tons of personality. And if then someone told me that just a few months later, I will get results from a newer system that makes this pale in comparison, I wouldn’t have believed a word of it. And my goodness, that is exactly what happened.

You see, this is the Midjourney text to image AI, and I was stunned when I found out that the first version of it appeared in February 2022, just a bit more than a year ago. And today, we are on version 5 and we are here to celebrate how far we have come. The results are simply unbelievable.

So, let’s have a look at a fox scientist created with version 1. Well, these results are not great.

Fox Scientist

It is still remarkable that a machine can give us something like this, but if I didn’t tell you that this should be a fox scientist, casting a magic spell, I don’t think you would have guessed. And it is not a question of getting a good randomized run, because we can try over and over again, and brace yourselves for some Picasso-ish results. These aren’t much better. Perhaps, even worse.

And now, hold on to your papers, and here come the results with version 5. Oh my goodness. Wow! Look at that quality. I cannot believe what I am seeing here. Can that really be? Because what you see here is this progress in just one year. We can even request more or less stylized images, and it delivers over and over again.

And I have to note that this was not a very elaborately written prompt. I just asked for a stern looking fox in a labcoat, casting a magic spell.

What’s more, there is a separate model that we can use in Midjourney that is specifically tailored for Japanese, anime, and illustrative styles. And that one delivers too.

And I am truly shocked to find out that looking at the new results, the one that I previously thought was a legendary image, really pales in comparison. And this system can generate ten thousand better ones every single day. Wow. My mind is blown.

Environment Concepts

Now, we are going to explore 4 more categories with eye-poppingly beautiful results. First is video game environment concepts. This is version 1 taking a crack at it. Well, this is not the eye-poppingly beautiful result, that’s for sure. Can you tell what the prompt was? Neither can I, unless I look. We were looking for a mountainous location in a fantasy world with low-polygon models.

It does have a certain mood and I kinda like some of them, but I cannot wait to see the results with the new version. Look! Now we’re talking!

Or, if we feel that the game needs some more adventure here, we can let our imagination take over and ask, for instance, for a palace. Hmm, that looks good. I like this one too.

Two, next up, photorealism. Oh boy, is it good at that. If you are looking for a funny image of a dog that is a little lost underwater, I would like to ask you if you are ready to see the results with version 1? Not for the faint of heart. So, are you ready? Are you sure? Ok, here we go.

One Year Later

Oh my. Everything everywhere. Not great. So what do we get today, a year later? What? These look ridiculously detailed. Now that is what I call an incredible improvement. It nailed the depiction of the frightened eyes of this poor dog.

And now, little AI, I’d like an incomplete humanoid android please. Version 1, and then, version 5. I’ll tell you some more about imperfections in a moment, but I am stunned. I cannot recover from the fact that all this improvement took place in just one year. Loving it.

Three, after photorealism, how about a little art. A pixie girl in an underwater dreamland. You know the drill, version 1. You know what to expect. Yes, roughly that. Now, the new version. How is that even possible? These are so good, imaginative even.

New Version

And we really don’t have to hunt for too long for these results. The technique delivers over and over again. Sorry, I know I keep saying this, but all this improvement in just one year is absolutely stunning. Can you even imagine what we will be able to do with this 2 more years and papers down the line?

Now, four, after art, how about photos of people? This time I won’t bore you with the version 1 results. Let’s go straight to the new ones. Since we can control these results with our text prompts, in each case, we can even keep the scene and the lighting fixed and just change the appearance of the test subject. If we decide that we don’t like the long straight hair, we can ask for curly hair instead, or change the test subject entirely.

It feels like we can do absolutely anything today.

Conclusion

Now, if you are interested, as Midjourney requires a subscription, as a substitute, you can also use Stable Diffusion for free on the web, or run it on your own hardware to get similar results. What is even more mind blowing is that you can even run it on your own phone. Then, all the computation for this takes place in your pocket. You can carry it anywhere you want and unleash your creativity on the go. All this for free for everyone. Fantastic! We will have several episodes coming on Stable Diffusion, so if that is something that you find interesting, consider subscribing and click the bell icon to make sure not to miss it. And to think that all this is going to get faster and easier over time.

Now, as promised, I’d like to show you that not even this amazing current version is perfect. Although hands and interactions with hands are still a problem, if we rerun the prompts with version 5 a few times, it usually gets them right. Except in one scholarly case. What is the scholarly case, you ask? Well, like creating an icon for a robot hand holding on to papers, I don’t know what that would be useful for. Do you have any ideas? This did not go too well. Look. This AI can do anything you want, but it still can’t hold on to its papers! What a time to be alive!

So, what do you think? What would you use this for? Let me know in the comments below!

Thanks for watching and for your generous support, and I'll see you next time!
