Full courses + unlimited support: https://www.skool.com/ai-automation-society-plus/about
All my FREE resources: https://www.skool.com/ai-automation-society/about
Apply for my YT podcast: https://podcast.nateherk.com/apply
Work with me: https://uppitai.com/
My Tools💻
14-day FREE n8n trial: https://n8n.partnerlinks.io/22crlu8afq5r
Code NATEHERK to Self-Host n8n for 10% off (annual plan): http://hostinger.com/nateherk
Voice to text: https://ref.wisprflow.ai/nateherk
In this video, I show you how to use Antigravity with Nano Banana 2 to generate insane images with perfect text using JSON prompting. JSON prompting gives us so much more control over the image generation process, and the results are way more consistent than traditional prompting.
Because of that, I built a skill around this workflow so that whenever I want to create an image for a video or any other project, Gemini 3.1 Pro in Antigravity automatically uses the skill to craft a perfect JSON prompt, which then generates a perfect image every time. I walk you through exactly how all of this works step by step and show you how to set it up for yourself so you can start getting the same results.
Sponsorship Inquiries:
📧 sponsorships@nateherk.com
0:00 - Nano Banana 2
3:44 - JSON Prompting Explained
6:54 - Antigravity Project Setup & Live Demo
12:40 - Pricing & Final Thoughts
Nano Banana 2
So, Google's new image model, Nano Banana 2, just dropped, and it's cheaper, faster, and smarter than Nano Banana Pro. And those are all good things. So, I've been obsessing over it since it came out, and I found a way to pair it with Antigravity to make the outputs more consistent, better, and cheaper. So, in this video, I'm going to show you guys exactly what I mean by that and give you everything you need for free to copy this system. So, let's not waste any time and get right into it. All right, let's just start off by looking at some examples and appreciating how good this model truly is. I mean, look how realistic that looks. Look at this: the details on his face, the before-and-after shot, panoramic photography. We've got text on this that could be used as an ad. This is Sydney Sweeney as well. This just looks like a real selfie. And even being able to redesign game UIs and things like that. Now, all of these examples that we just looked at are things that would have normally taken time, equipment, money, and even good luck. If you needed to do stuff like, you know, photography, you would need good weather or a sunset or whatever it is. But I know you guys all know this. So, I'm going to talk about why Nano Banana 2 is actually better. In my opinion, it got three major upgrades: first of all, it's much faster. The second one is better control, so I feel like when I'm talking to it, it actually understands me better. And the third one is it can use real data and actually does some digging on its own, so it's so much smarter. And of course, number four, we cannot forget the fact that this is cheaper than Nano Banana Pro. So, one of the things that really excited me was the quality leap in the actual text, because I've been using Nano Banana Pro to create a lot of these infographics and thumbnails. All the diagrams that you're looking at today were generated with the new Nano Banana 2 model, and I've never had it actually misspell anything. 
With Nano Banana Pro, it was misspelling things or having weird characters all the time. So, here's a quick example. All I said was create me an infographic about how photosynthesis works. It has all these arrows, it has all the spatial elements to think about, and it's getting all of these words correct. At least I think it is, because I don't know what these actually should be. Maybe I should have picked a different example. So, we'll actually just try it out super quick to see some other text. You can go right now to Gemini, click on tools, create image, and play around with Nano Banana 2 right now. As you can see, you could also go to Google AI Studio if you want to use this over API. And this actually is Gemini 3.1 Flash Image Preview, but it's Nano Banana 2. It's just way more fun to say Nano Banana 2. But real quick, take note of the cost if you use this over API. And later, I'm going to show you how we can actually use it for cheaper. But this image came back in about 10 seconds. And if I click this open, we should see that we don't have any misspellings. It also has, in the very bottom right, an explicit mention that this was made with AI. And so, yes, I do believe that most AI models would understand how photosynthesis works, but you can see that we can also enable Google search, which says use Google Web Search grounding to generate images based on real-time information. As you can see right here in Google AI Studio, it says that Nano Banana 2 has a knowledge cutoff of January 2025. So, the fact that we can enable Google search in the image generation is just super cool. And just as a real quick funny example, I said, "Create me an image of the cast of The Office doing some of their favorite hobbies." And I ran this with Google search off and with Google search on. It's still pretty accurate with Google search off if you really want to analyze this. But one thing I did notice is right here it says Scranton's own, the whiskey drinkers. 
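If you're curious what that API call looks like under the hood, here's a minimal sketch of the REST request body for an image generation call with optional search grounding. The model ID is an assumption based on the name shown in AI Studio in the video (check AI Studio for the exact current ID), and only the request-building step is shown, no network call.

```python
import json

# Model ID as shown in Google AI Studio in the video (an assumption;
# verify the exact current ID in AI Studio before using).
MODEL = "gemini-3.1-flash-image-preview"

def build_image_request(prompt: str, use_google_search: bool = False) -> dict:
    """Build a generateContent request body for an image generation call.

    Setting responseModalities to TEXT + IMAGE asks the model to return an
    image part; adding the google_search tool enables web grounding, so the
    model can pull real-time information past its January 2025 cutoff.
    """
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }
    if use_google_search:
        body["tools"] = [{"google_search": {}}]
    return body

# The body would be POSTed to:
# https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent
req = build_image_request(
    "Create me an infographic about how photosynthesis works",
    use_google_search=True,
)
print(json.dumps(req, indent=2))
```

The same toggle you click in AI Studio's UI is just that one `tools` entry in the request body.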
But on the right-hand side, with Google search, it said Scrantonicity, which is, to my memory, correct. And then I ran another one with Friends. I know these shows have been around for a long time, so maybe it's in the training data of the model, but I still thought it was cool, because it also knows the setting of the show and knows what all the characters look like. So we're getting Pro quality and Flash speed in Nano Banana. It's sharp. It's fast. It's smart. So here's an example where I said create a hyperrealistic image of Derek Jeter playing the piano. On the left was Nano Banana Pro and on the right was Nano Banana 2. Now, I will say both of these images look really good, right? Like, that looks very realistic and he's playing the piano. This one looks good as well. But when you consider that this one was slower and more expensive than Nano Banana 2, then it's a clear winner in my mind. Now, here are
JSON Prompting Explained
two examples that I saw on Twitter that I thought were really, really good. I probably would have thought that this was a real model, and once again, a real model for this, you know, photo shoot. But, of course, the key to good outputs is the prompt. And so, why are these examples so good? Because they're using a JSON prompting structure, which basically means that we have so much more control over what actually happens and we can communicate so much more clearly with Nano Banana 2. Computers love JSON. AI loves JSON. It's really easy for them to read it and understand it, because there are specific arguments. In this version, it knows exactly what the prompt is. It knows the style. It knows the lighting, the camera, the resolution, and the aspect ratio. Which is why, if you're ever manually setting up an HTTP request, you're sending over that body as JSON, because AI or computers can process that super quickly and understand it. Here's the next example. This one is another JSON prompt. It's a little bit different. It was much longer and it has different arguments. You can see here we have the subject. We have the composition with layout, camera angle, framing, camera height, lens, focus, depth of field. So, we can get super granular here. And that's really the key to getting an output that doesn't look AI generated, or at least looks so good that you can't tell it's AI generated. We have the character, we have the expression, the before state of the lips. We have the volume, hydration, lighting, type, consistency, color grading. I think you guys understand the point. Plain-text image prompts are kind of vague. They're a bit more random and they don't give you as much control, which will basically just lead to outputs that are inconsistent. Not to say that they'd be bad, because the image models are so good now. But every time that you, you know, chuck a prompt in there, it's almost like you're pulling a lever on a slot machine. 
You're kind of just hoping you get what you want, but you just don't know. But with a JSON prompt, you can actually know you're going to get everything you want, because you have, like I said, these arguments. You can be precise, you have way more control, and you're going to get so much more consistency. And that's exactly what I'm going to show you guys today. Now, before you guys run to the comments and say, "I am not writing that JSON myself," or "he's going to charge us $500 to learn how to JSON prompt," just bear with me. AI loves JSON, which means AI can write really good JSON. So, let me walk you guys through an example. I sent off this prompt to Nano Banana 2: create a close-up image of the right half of a woman's face. And it came back with this, which is solid, right? But it still looks a little bit AI. So, what we can do is take that same prompt and shoot it off to Gemini 3 Pro inside of Antigravity. Antigravity uses that super smart brain right here combined with the skill, which is basically a prompt we gave it that says, hey, whenever you want to create an image, do this and this, and the results will be really good. I will be giving you guys that skill completely free. And now it takes this super vague request, uses its skill, and turns it into a JSON prompt. We've got the prompt, we've got a negative prompt (which is things that we don't want), and we've got the setting with resolution, style, lighting, camera angle, things like that. And then we get an output that I think looks much better and way more realistic. I mean, look at the pores and the blemishes. And it actually followed what I said, which was I just want the right half of the face. So, that's what I'm about to show you guys: using Gemini 3 Pro to write us the JSON prompts to create the Nano Banana 2 images, and we get way more consistent results. 
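To make the idea concrete, here's a small sketch of a JSON prompt builder. The field names (prompt, negative_prompt, and a setting block with resolution, style, lighting, and camera angle) follow the ones called out above; the default values are just illustrative, not the skill's actual schema.

```python
import json

def build_json_prompt(subject: str, negative: list[str], **setting) -> str:
    """Assemble a structured image prompt like the ones Gemini 3 Pro writes.

    Explicit fields (negative prompt, resolution, lighting, camera angle)
    are what give the image model consistent, controllable output compared
    to a plain-text prompt.
    """
    prompt = {
        "prompt": subject,
        "negative_prompt": negative,
        "setting": {
            "resolution": setting.get("resolution", "2K"),
            "style": setting.get("style", "hyperrealistic photography"),
            "lighting": setting.get("lighting", "soft natural window light"),
            "camera_angle": setting.get("camera_angle", "close-up, eye level"),
            "aspect_ratio": setting.get("aspect_ratio", "3:4"),
        },
    }
    return json.dumps(prompt, indent=2)

print(build_json_prompt(
    "Close-up of the right half of a woman's face",
    negative=["smooth airbrushed skin", "extra fingers", "watermark"],
    lighting="golden hour rim light",
))
```

Swap any one field and everything else stays fixed, which is exactly why the outputs stop feeling like a slot machine.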
And if you don't already believe me, every single one of these little images has had the exact same styling, and that's because I'm JSON prompting it to do so. Okay, so now we
Antigravity Project Setup & Live Demo
are in Antigravity, which is basically just an IDE. If you've been following my channel for a while, it's essentially the same thing as VS Code. You've got the extensions over here, you've got the file explorer, you've got your files, and then you've got the Antigravity agent over here. And you could even use Claude Code within Antigravity. But right now, we're going to be using the Antigravity agent, because we want to use Gemini 3.1 Pro, and I think that 3.1 Pro is probably the best when it comes to aesthetics and visual design. If you don't have Antigravity, it's completely free to download. Just go to antigravity.google and download it for your operating system. Now, once you're in here, what you're going to do is click on File, open a folder, just create a new one anywhere on your, you know, desktop, and then open it up. It won't look like mine, because there'll be no files in here, but I will get you guys caught up to speed. So, what do we have over here on the left-hand side? Well, the first thing we have is the gemini.md, which, if you've been using Claude Code, is essentially the exact same thing as a CLAUDE.md. This is telling my agent how to set up this project and what to do. So, I'm saying it is critical to always keep prompts and images organized in this structure. I tell it about the image generation workflow using the Nano Banana skill, and I'm basically just explaining the rules for this project. The way the gemini.md file works is that every time, before the agent actually reads the question you shoot off to it, it reads this first. It's basically just like a master system prompt. So you ask a question, it reads gemini.md, and then it gives you a lot more predictable output. Honestly, a very similar thing to a skill. Now, what else do we have in here that I want to call your attention to? You can see, obviously, we've got the skill, which is Nano Banana image generation. 
And if I open this up, this is pretty much the exact same thing as a Claude Code skill. These skills are just markdown files, and you can switch between different AI models. If you guys want to grab this skill, the gemini.md, or anything else that you need for this video specifically, just go over to my free Skool community. The link for that is down in the description. Go to the classroom and you can access the agent skills classroom, which has everything you need. So, I'm not going to read this whole thing, because you guys can grab it. But the skill basically explains how to JSON prompt. It tells Gemini 3 Pro how to do all of this. I show some schema stuff. I show what to put in which arguments. And what you'll notice here is I also say, if you need to understand more and you want to look at different options, you can look at the master reference guide, which is right here. And this is basically the exact same thing, but it goes way, way more in depth. So you guys will get access to all of that. So, basically, the way that skills work is you ask the agent a question and it searches through all of the existing skills. So if you asked a question about code, it would look through the skills and say, "Okay, cool, I have a code skill. Let me use that so that I have a better output, a more predictable output." And then, finally, the last thing is that we have two folders I wanted you to look at: images and prompts. Both of these have been organized into infographics, miscellaneous, people, photography, and products. So every time that we generate an image, let's go to the people folder. Let's say we do this one. We can see the actual JSON prompt that it wrote for us. And then, if we open up that same thing in our images folder, we would see the actual image that was created from that prompt. And as you can see, this one once again looks amazing. 
Okay, now that you guys understand a little bit about what's going on in the project, let's do a demo. I'm going to go over here to the agent and send over a prompt. Okay, so as you can see, I said, "I need you to generate me three different images with different styles so I can see which one I like best." And all I gave it as far as a prompt was a young woman holding up a beauty product. So, you can see what it's doing. It searched for different skills and now it's analyzing the Nano Banana image skill. What I love about these tools is you can literally watch them think and see exactly what they're doing. So look at this. It actually analyzed a JSON prompt from earlier, because maybe it wants to pull some of those elements to make this prompt even better. And then it creates basically a to-do list so we can visually see where it is, and it will check off these things as it goes. So you can see style one is going to be documentary realism, style two is lifestyle influencer, and style three is going to be vintage film. All right, so that just finished up. We've got our three new images, and they're all in the people folder. So, the first one is the documentary realism. Okay, that looks pretty good. Everything here looks very realistic. The product looks good as well. Everything's spelled correctly. The second one was lifestyle. You can see it looks really, really good. Very realistic. And this one is just like she's in her bathroom. And then we had this one, which was vintage, which I honestly don't love. I mean, it still looks good. It still looks like a real photo for sure, but this isn't the style that I'd want. So, now that we've figured out we like style 2, we could go ahead and generate more in that exact style, because those prompts are saved right in here: prompts, people, and then we have these things right there. Now, this is also really helpful not just for people, but for products as well. 
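Because each winning prompt is saved as a file, reusing a style is just loading the JSON and swapping the subject. Here's a minimal sketch; the file name and field layout are assumptions based on the structure shown above, not the agent's exact output.

```python
import json
from pathlib import Path

def reuse_style(saved_prompt_path: str, new_subject: str) -> dict:
    """Load a saved JSON prompt and keep everything (style, lighting,
    setting) except the subject, so new images match the chosen look."""
    prompt = json.loads(Path(saved_prompt_path).read_text())
    prompt["prompt"] = new_subject  # swap only the subject line
    return prompt

# Example: write a style-2 prompt like the one the agent saved,
# then reuse it for a new subject.
saved = Path("style2_lifestyle.json")
saved.write_text(json.dumps({
    "prompt": "A young woman holding up a beauty product",
    "negative_prompt": ["watermark", "blurry"],
    "setting": {"style": "lifestyle influencer",
                "lighting": "bright bathroom vanity light"},
}))
variant = reuse_style("style2_lifestyle.json", "A young woman applying a serum at her vanity")
print(variant["setting"]["style"])  # the chosen style carries over unchanged
```

Only the subject changes between generations, which is what keeps the whole series looking like one consistent shoot.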
So, let's say you uploaded a picture of a product that you have. Nano Banana 2 can not only turn text into images, but also images into images. So, I just threw in an old picture of a random bottle of cologne, and I said, "Create me two different product ad photos for this image of a cologne bottle." All right. So, it says, "I've successfully created two different photos. We've got oceanic freshness and we've got minimalist studio elegance. Here is the prompt for the first one. Here's the second one. Let's take a look." Okay. So, this is the oceanic freshness, I'm assuming. And if I pull up the original real quick, you can see that we have the same bottle shape and the same font. So, that is great. And now, if I pull up the second one, you can see that this is just a much cleaner one. It looks like it's in a studio. And once again, same font and everything like that. This one even added something: it added the text down here with 3.4 fluid ounces. So, what that tells me is it probably did some Google searching before it made this actual image. Now, the cool thing about these skills is the more and more you use them, the better and better they get. Because what I could do now is basically come in here and say, "Cool. For image one, I loved this. For image two, I hated this. Update your skill so that in the future that always happens." And if you guys are curious, the way that I actually built this skill is I had it scrape tons of information about JSON prompting, finding examples and things like that. And then I just had it create like 20 images for me. I said, "Here's what I like about all of them. Here's what I don't like. Give me 20 more." And then I just kept doing that and giving feedback. So the last thing I wanted to talk about
Pricing & Final Thoughts
here is how we're actually executing these images. Well, I'm using Python scripts that make a call to Kie. Now, what you can do is start off in Antigravity, and it can natively generate images for you for free. But you'll probably hit your limit pretty quickly. Still, it's a good way to test it out. And if you remember the price that we saw in Google AI Studio right here: if we go over to kie.ai, you'll notice that it's actually way cheaper. It's only 4 cents for a 1K resolution image. And then, as you go up in resolution, it gets a little bit more expensive. But even for 4K, it is only 9 cents, which is about 40% cheaper than the official price. Now, I just wanted to real quick show you guys an example that wasn't so good, just so I'm not out here saying that this is the most amazing magic thing in the world. Here's an example where I said, "Create me an infographic about how to make an old-fashioned." This is the one that I sent to Nano Banana 2 with no JSON prompt, and this is the one where I used my skill to JSON prompt. Now, at first glance, it looks way better. It's super aesthetic. It's like a top-down view. But what you'll notice is that right here, we have this text that says one sugar cube, which is over here. This probably should have said, you know, mixing spoon or whatever you want to call that. So, it is AI. It still is a black box. It still is basically pulling a lever on a slot machine. But with things like skills and JSON prompting, you really can make your outputs way more consistent, and you, the human, can be in a lot more control. So, that's going to do it for this one. Don't forget that you can grab everything you need in order to do this in my free Skool community, in the skills section. The link is in the description. And if you love nerding out about this kind of stuff, then you should definitely check out my paid community. The link for that is also in the description. 
We've got over 3,000 members in here who are building with AI every day, and they're all building businesses around AI automation. So, if that's what your goal is, then it's a great place to surround yourself with like-minded individuals. But that is going to do it for today. So if you guys enjoyed or you learned something new, please give it a like. Definitely helps me out a ton. And as always, I appreciate you guys making it to the end of the video. I'll see you on the next one. Thanks everyone.