Midjourney AI: How Is This Even Possible?

Two Minute Papers · April 18, 2023 · 259,382 views · 12,184 likes

Video description

❤️ Check out Weights & Biases and say hi in their community forum here: https://wandb.me/paperforum
❤️ Get more than $50 off from an upcoming W&B event in San Francisco! - https://www.fullyconnected.com?promo=2mp

Try Stable Diffusion:
Web 1: https://huggingface.co/spaces/stabilityai/stable-diffusion
Web 2: https://beta.dreamstudio.ai/generate
Web 3 (also Stable Diffusion XL!): https://clipdrop.co/stable-diffusion
Web 4 (notebooks): https://github.com/TheLastBen/fast-stable-diffusion
Stable Diffusion Web UI (Windows/MacOS) https://github.com/AUTOMATIC1111/stable-diffusion-webui
Guide for installation: https://github.com/cmdr2/stable-diffusion-ui
Draw Things app (MacOS): https://drawthings.ai/
Simpler app (MacOS): https://huggingface.co/blog/fast-mac-diffusers
Guide: https://stable-diffusion-art.com/know-these-important-parameters-for-stunning-ai-images/
Midjourney (requires subscription): https://www.midjourney.com/app/

My latest paper on simulations that look almost like reality is available for free here: https://rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations: https://www.nature.com/articles/s41567-022-01788-5

Prompts:
Ocean element girl prompt: nicoleespinoza
Video game concept prompt: Herkyms
Humanoid android prompt: !D3v1L_
Dog prompt: throckwoddle
Bedroom prompt: Falcon
Sprite girl: nicoleespinoza
Cat painting: Suspect Jesse
Photo of Jinx (prompt): Brandi.Eduardo
Man and woman photo: TheShadowman

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible: Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Benji Rabhan, Bryan Learn, B Shang, Christian Ahlin, Eric Martel, Geronimo Moralez, Gordon Child, Jace O'Brien, Jack Lukic, John Le, Jonathan, Kenneth Davis, Klaus Busse, Kyle Davis, Lorin Atzberger, Lukas Biewald, Martin, Matthew Valle, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Rajarshi Nigam, Ramsey Elbasheer, Richard Sundvall, Steef, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: https://www.patreon.com/TwoMinutePapers

Károly Zsolnai-Fehér's links:
Twitter: https://twitter.com/twominutepapers
Web: https://cg.tuwien.ac.at/~zsolnai/

Contents (6 segments)

Introduction

Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Today we will celebrate how far we have come. Not so long ago, I made these pictures with OpenAI’s text-to-image AI, DALL-E 2, and I was elated. These are really cool, and have tons of personality. And if someone had told me then that just a few months later I would get results from a newer system that makes these pale in comparison, I wouldn’t have believed a word of it. And my goodness, that is exactly what happened. You see, this is the Midjourney text-to-image AI, and I was stunned when I found out that the first version of it appeared in February 2022, just a bit more than a year ago. And today, we are on version 5 and we are here to celebrate how far we have come. The results are simply unbelievable. So, let’s have a look at a fox scientist created with version 1. Well, these results are not great.

Fox Scientist

It is still remarkable that a machine can give us something like this, but if I didn’t tell you that this should be a fox scientist casting a magic spell, I don’t think you would have guessed. And it is not a question of getting a good randomized run, because we can try over and over again, and brace yourselves for some Picasso-ish results. These aren’t much better. Perhaps even worse. And now, hold on to your papers, and here come the results with version 5. Oh my goodness. Wow! Look at that quality. I cannot believe what I am seeing here. Can that really be? Because what you see here is the progress made in just one year. We can even request more or less stylized images, and it delivers over and over again. And I have to note that this was not a very elaborately written prompt. I just asked for a stern-looking fox in a lab coat, casting a magic spell. What’s more, there is a separate model that we can use in Midjourney that is specifically tailored for Japanese, anime, and illustrative styles. And that one delivers too. And I am truly shocked to find out that looking at the new results, the one that I previously thought was a legendary image really pales in comparison. And this system can generate ten thousand better ones every single day. Wow. My mind is blown.

Environment Concepts

Now, we are going to explore 4 more categories with eye-poppingly beautiful results. First is video game environment concepts. This is version 1 taking a crack at it. Well, this is not the eye-poppingly beautiful result, that’s for sure. Can you tell what the prompt was? Neither can I, unless I look. We were looking for a mountainous location in a fantasy world with low-polygon models. It does have a certain mood and I kinda like some of them, but I cannot wait to see the results with the new version. Look! Now we’re talking! Or, if we feel that the game needs some more adventure here, we can let our imagination take over and ask, for instance, for a palace. Hmm, that looks good. I like this one too. Two, next up, photorealism. Oh boy, is it good at that. If you are looking for a funny image of a dog that is a little lost underwater, I would like to ask you if you are ready to see the results with version 1? Not for the faint of heart. So, are you ready? Are you sure? Ok, here we go.

One Year Later

Oh my. Everything everywhere. Not great. So what do we get today, a year later? What? These look ridiculously detailed. Now that is what I call an incredible improvement. It nailed the depiction of the frightened eyes of this poor dog. And now, little AI, I’d like an incomplete humanoid android please. Version 1, and then, version 5. I’ll tell you some more about imperfections in a moment, but I am stunned. I cannot recover from the fact that all this improvement took place in just one year. Loving it. Three, after photorealism, how about a little art. A pixie girl in an underwater dreamland. You know the drill, version 1. You know what to expect. Yes, roughly that. Now, the new version. How is that even possible? These are so good, imaginative even.

New Version

And we really don’t have to hunt for too long for these results. The technique delivers over and over again. Sorry, I know I keep saying this, but all this improvement in just one year is absolutely stunning. Can you even imagine what we will be able to do with this 2 more years and papers down the line? Now, four, after art, how about photos of people? This time I won’t bore you with the version 1 results. Let’s go straight to the new ones. Since we can control these results with our text prompts, in each case, we can even keep the scene and the lighting fixed and just change the appearance of the test subject. If we decide that we don’t like the long straight hair, we can ask for curly hair instead, or change the test subject entirely. It feels like we can do absolutely anything today.

Conclusion

Now, if you are interested, since Midjourney requires a subscription, you can also use Stable Diffusion as a substitute, for free on the web, or run it on your own hardware to get similar results. What is even more mind blowing is that you can even run it on your own phone. Then, all the computation for this takes place in your pocket. You can carry it anywhere you want and unleash your creativity on the go. All this for free for everyone. Fantastic! We will have several episodes coming on Stable Diffusion, so if that is something that you find interesting, consider subscribing and click the bell icon to make sure not to miss it. And to think that all this is going to get faster and easier over time. Now, as promised, I’d like to show you that not even this amazing current version is perfect. Although hands and interactions with hands are still a problem, if we rerun the prompts with version 5 a few times, it usually gets them right. Except in one scholarly case. What is the scholarly case, you ask? Well, like creating an icon for a robot hand holding on to papers. I don’t know what that would be useful for. Do you have any ideas? This did not go too well. Look. This AI can do anything you want, but it still can’t hold on to its papers! What a time to be alive! So, what do you think? What would you use this for? Let me know in the comments below! Thanks for watching and for your generous support, and I'll see you next time!
