Subscribe to the channel for more AI tool breakdowns and comparisons like this!
OpenAI Image 1.5 is finally here! Let's compare it to Nano Banana and see which one is better.
Links:
🔑 Free ChatGPT Prompt Templates: https://bit.ly/newsletter-aia
🧑💻 Igor Pogany on LinkedIn: https://bit.ly/IgorLinkedIn
🐦 Twitter/X: https://bit.ly/AIAonTwitter
📸 Instagram: https://bit.ly/AIAinsta
Release Blog: https://openai.com/index/new-chatgpt-images-is-here/
ChatGPT: https://chatgpt.com/
Gemini: https://gemini.google.com/
So, OpenAI's ChatGPT just dropped the one feature where they were really behind the competition: image generation. Introducing their brand-new Image 1.5 model. In this blog post, you can see a bunch of comparisons, but what we're going to do here today is the thing that I wish somebody did for me, the thing that I care about: going through various use cases and comparing them to the big competitor, which, as you might know, is Google Gemini's Nano Banana, and now Nano Banana Pro. If you're not familiar, Google has Gemini, which is the competitor to OpenAI's ChatGPT. And within that they launched this Nano Banana model that is just so good at not just generating images (we have a lot of good models for that) but editing images. And the story of this release for OpenAI really is them trying to catch up to or overtake Google's Nano Banana model. So in this video I'll show you various examples. I'll show you how to use it. I'll show you what you can expect and give you a verdict on which one of them is better, and whether you should consider switching or if this is all you need. So, I'll just start by quickly showing you this blog post, because there are a lot of great examples, and if you want to get inspired on what you can do with these, this is the place to do that. What I'm going to be doing in this video is comparing stuff, but this is just good stuff here. There are good examples here. Now let me show you how to use it. If you go into ChatGPT, starting here, it's very simple. The old model doesn't exist in there anymore; there's just the new one. So, when you ask it to create an image of a cat with a hat, then boom, it will use the brand-new model. If I open a new tab and go here, you will also see that this Images tab is brand new. This wasn't here before. So if I go in here, everything I type will automatically be an image. You can also use these presets here. You can browse through them.
You can also create beautiful images like the ones you saw in the blog post. It's quite simple. If I check out the cat-with-a-hat image, you might notice that this is already rendering, and any second we should see it appear. This thing is four times faster. That's a big deal. Four times faster than the previous model, which was very, very slow. It was quite annoying, actually, how slow it was. And this, look at that, this is totally feasible. Plus, you can also run multiple of these in parallel. I'm not exactly sure what the limit is, but you can run something like four images in parallel, no problem, which is nice. And it's a solid image generator. To talk about image generation quality before we go into comparisons and maybe a bit more in-depth use cases, I will say this. We ran our test prompts that you might be familiar with. We have these basic test prompts, and I even started questioning how much this matters with all the image generators, because look at that: Nano Banana Pro, Midjourney V7, Flux 2, they're all just good. There's Grok Image here; they're all good. Okay, they're different flavors of good, but I mean, look at that. If I zoom in a little bit on these logos, okay, maybe you have a preference. Maybe Hunyuan 3.0 is the best here. I don't know. ChatGPT's image is also really good. It's just good. They can all do text, as you can see in this book cover example here, right? All of them can do text really well. As a matter of fact, this one does infographics really well, just like Nano Banana, and all the other examples. I mean, aerial photography: these are just different flavors of good. I don't know. There's no real difference, in my opinion. Maybe, sure, Midjourney still has the stylistic upper hand. I think it wins on style, right? But when it comes to generating images, I think they're all just good. They're all really good. Now, there's a second thing to consider here, which is not image generation, but image editing.
And that's where I want to pull away the screen and actually show you some of these comparisons that I prepared here for you. Because if you look at that, this is why Google took the throne and why Google was the best in terms of image models: it could do not just image generation but also image editing. Now ChatGPT can do it too, at the same level. That would be my conclusion in terms of image editing. Image generation is also at the same level as some of these top models, but image editing is too. Now, if I start with a cat and I say, "Make this cat into a realistic-looking, mouthwatering gingerbread cookie." By the way, this is one of the presets. If you navigate to the Images tab, then here are the different styles you can try. That is this one. So yeah, you can do presets like that. I would just need to select an image. You get the point. But let's have a look at this gingerbread cookie that we get from this cat. Same input image, same prompt. On the right side, we have Nano Banana. On the left side, we have ChatGPT Image 1.5. Well, let's look at that cookie. Hm. Okay, this one has the background, but in terms of cookie
quality, I mean, I don't know. It's the same thing in terms of which one is better. And this was the theme across me looking at many of these examples, and all across the internet with people comparing these. It's just different flavors of good. That would be my conclusion. Now look, editing: "Now make it wear a gingerbread Santa hat." It did it here, but it also did it here. Different flavors of good. You have to be extremely nitpicky to really say, okay, on this one, I don't know, the background is more stock-photo-y, but then I could prompt that away. And which one is more realistic? I don't know, hard to say. Let's look at another example. This one: "Make me into Santa Claus on top of our roof climbing into a chimney." This is the same input image of me in both of these. Quick refresh, you'll see it. There you go. And then this is Santa one and this is Santa two. Which one is better? Well, arguably I would say the ChatGPT image looks a bit more like me. And this is one thing that I did notice: it resembles me better. But the thing is, honestly, I have a pretty, how do I put it, generic face. Maybe generic is not the right word, but similar-ish faces are well represented in the training data, whereas some people have a face that is not very typical, let's say, and then the images will look different, because this is just looking into training data and finding the closest approximation. And I think for me, definitely, ChatGPT is way better. Overall, from what I've seen, ChatGPT is more consistent at keeping a human's face, and also, from some tests that I've seen all across Twitter, it's way better at doing multiple people, like four to six people. ChatGPT is way better than Gemini. But on this one, kind of the same. I would say maybe ChatGPT wins because it looks more like me than this one. On this one I would barely recognize myself, but I think it's very similar. One more.
And this is something that I always try, but that no model is really good at: "Use this picture of me as a reference and make me into a kite surfer hitting a new personal record of a 20 meter jump height." Okay, if you don't know, I love kite surfing, and it kind of does this. Now, I will say I think on this one, the Gemini one is actually way better because of the proportions. Here, this kite is not supposed to be here. This is actually right. I'm looking at the watch. Okay, this is still good. It's not night and day on this one. Gemini takes it home if I had to make a decision. Now, if I scroll further and say, "Edit this image, making it look like I'm 20 meters up in the air flying next to seagulls." Okay, so if I do that, the results: well, on this one, I would argue that ChatGPT did a better job at editing this. It put the seagulls in there. It put me, well, it changed the ocean. Whereas on this one, I think I'm lower down. So if I had to give a verdict on editing, it's this: potato, potahto. It's not clear which one is better. It just depends on the use case. I would say overall my conclusion would be that they're very similar. Now I'll do one more, and I'll show you how to do this remixing of images, if you're not familiar yet. I'll just add a face of me and then I'll say, "Create a YouTube thumbnail, shocked face, promoting a revolutionary image generation AI model," and we can do a side-by-side comparison. Nice. Put that in here, too. Upload an image of me, too. And as a final showdown, we can run both of these. Oh, it should be in a new chat then. Put this in here. Okay. Send. Send at the same time. And while these results manifest and show on screen here, I want to give you my conclusion from all the examples I've seen so far. I think they're on par. I think they're both excellent at text and infographics. I think they're both excellent at turning faces into humans.
Arguably, ChatGPT is maybe a little better, but I think the big overarching story here is that OpenAI dropped this model to be on par with Gemini, so people don't have a reason to leave. And I think they succeeded. Last week, I did the video on the GPT-5.2 model, which made this on par with Gemini 3, which was Google's model. And with these two changes, there's not much reason to go to Gemini, in my opinion, because you have the image generation, you have the smart model, you have good instruction following, and arguably you
have the better face retention. I mean, let's have a look at this. I think this looks kind of like me, not entirely. And let's have a look at this final result. I mean, hey, come on, do your best, ChatGPT. A lot of people are going to judge you based on this. Personally, well, you tell me what you think. I think the graphics on this are better. From a graphic design perspective, this is better. But from the perspective of the face actually looking like me, I'd say ChatGPT is actually better. I'd say this one is closer to me than this one. But in terms of graphic design, it's this one. So hey, if you want the best editing model, maybe try both. If I only had to pick one, I personally would pick ChatGPT. Part of the reason is that I'm already in there and using it as my daily driver. And another part of the reason is that I really like how it maintains my face and how it makes me look, but you'll have to try it out for yourself. That's pretty much everything I have for today. My name is Igor. I hope this was helpful to you, and go create some cool images. Final tip: creating custom Christmas cards for your family is a great gift. If you print those, it's personalized. It's easy. You can do it on the free accounts, and it's just really fun, and I think they'll appreciate it. All right, hope that helped. My name is Igor Pogany, and I hope you have a wonderful day.