Anyway, let's get to the more interesting part here, because this is a model that you'll be using, and I want to focus on o3 here. I think we talked about the different alternatives, releases, and benchmarks enough. The thing that really matters to me, and the point I really try to make on this channel, is how to democratize access to all of these tools, techniques, and workflows. And I haven't seen a tool since Deep Research that does it as well as o3 does now. Because this is the big point: o3 isn't just a thinking model. It's a thinking model that has access to tools. What kind of tools? Well, all the tools that ChatGPT has. Before, we saw limited releases: when o1-preview came out, you couldn't upload images, you couldn't upload files, you couldn't access memories, it could not use the data analysis tool, and so on. All of these things work automatically inside of o3. Now, this is a massive deal, and we'll see why in the examples that I have lined up for you next. But before we do that, I want to share one more thing. Look, I've made it my life to stay on top of all of this, test all of it, build an organization around it, and we teach it at scale to individuals and organizations. At this point, it's not often that a release stops me in my tracks and makes me rethink the way I use these tools and this technology, and the way it integrates into my life. As I said in the opener, many people are stating that this is the first release that can be considered AGI. I think that statement is debatable, because if you ask 100 people, you're going to get 100 definitions of AGI. And I'm making these videos to give you my personal take on this. The conclusion, more or less, is that this model makes me want to rerun every prompt I have ever tried, because it has the ability to surprise me with its intelligence and level of insight. And while that might not be the case on every single thing you run, it happened regularly to me in my first look here.
So with that being said, let's have a look at some of the things that I tried right away. One of them is this quick prompt from Matt Shumer on Twitter, which says, "Do intensive research on Egor Pogani and give me a massive report on everything you find." Obviously, you would replace my name with your name here, and you could try this for yourself. And what happens here is just so different from anything we've seen, except maybe Deep Research. It goes ahead, it starts thinking, it starts making a plan. Then it uses the web search tool to pull up multiple sites. It pulled up my Instagram, my LinkedIn, my X. It found the company website and the company LinkedIn. We're now going to be posting twice a day on our company LinkedIn, so go check that out if you're on that platform. And it even found my SoundCloud, which I haven't used in years, but fair enough. Then it goes ahead and figures out that, hey, I even made a typo writing my own name. Embarrassing. And it proceeds to think about this, pull up more links, and it finds a dozen different LinkedIn posts and other websites, including some from my university, etc. And it comes up with something that Deep Research would always refuse: a profile on me with all the details. It's quite comprehensive while not being overwhelming, which is sometimes the case with Deep Research. So the guardrails on this are lower than what we saw with Deep Research, and it's quite impressive. It pulls together many relevant things. But let me tell you, there are six or seven mistakes here that are just flat-out wrong. And those are not mistakes that o3 made; those are mistakes made by the websites that it referenced. For example, it says that my nationality is Austrian. This is actually not the case. I lived in Austria for most of my life, but I still have a Slovakian passport. But the great thing is it gives its source right here: this website, which I don't know much about, called Business ABC.
It just somehow assumes that my nationality is Austrian, and that's why it has it in its report. So it's not perfect, but clearly the big point here is that the guardrails are lower. With Deep Research, you could never have researched a person. And some of these other insights actually made me lean back in my chair and kind of surprised me. It made a comprehensive list of all the different paid products we offer. And it extrapolated the signature content themes, which made me think: yeah, we did use to do a lot of AI workflow stacks and workflow videos, and we do less of that these days. We should do more of it. They get fewer views, but they seem to communicate more value. And then the overarching theme of the channel is productivity hacks. I mean, we're doing this in the form of generative AI tools and workflows, but that's what a lot of people here care about. So I thought that was great. And then it ran things like the SWOT analysis at the end. And that's really one of the big points that I see here. It's not that none of this was possible before, but you had to bring in the context yourself. You had to know which prompts to use, and then you had to take the time to actually use them, wait for the results, think about them, reprompt, things like that. This does all of it for you. You don't need to engage a search function or give it specific websites that you want it to consider. You don't need to prompt it to come up with a structure like this that really makes sense. It just figures all of that out for you with a combination of a thinking model and tool usage. So obviously you could take this little prompt, replace the name, and run it on yourself or on anybody else. Maybe they'll restrict this down the line, but as of now it works, and in Deep Research this never used to work. This was one of the most interesting things that I wanted to try, and now it works in here.
But the second use case that I want to share here really shows the depth of the capability. This one was inspired by Matthew Berman on Twitter, doing something that many people on the internet are trying right now: giving it a picture and saying, figure out exactly where this person is. I did it on this wholesome picture of myself from the recent Japan trip, and o3 went off and started thinking about it. What it did is it cropped into the different parts of the image and started analyzing them in detail, translating the Japanese text and interpreting some of these signs. For example, here it says the crest has 16 petals, although it looks a bit stylized. It zooms in on different parts, and based on the caption and the text, it comes to the conclusion that this is clearly the Meiji Jingu Shinto shrine (I hope I'm pronouncing that correctly), located in Shibuya-ku, Tokyo, in a park in the middle of the city. And here are the clues it followed. It even states that this is the inner precinct exhibition area, which is exactly right. Then I was like, "Wow." It didn't just get the location right, it even got the exact area within the quite small shrine right. Impressive. But it did have several clues here, like the text and the logo. So I wanted to give it something a bit harder. I took a second picture for my script which had no text and no logos. As a matter of fact, it's at night, and I figured there are no real clues in this one. And I asked the same thing: figure out exactly where this person is. It thought for about two and a half minutes. It started by cropping into the various parts and coming up with some assumptions and candidate locations: this could maybe be Kamakura's Hōkoku-ji Temple or the Kōdai-ji Temple in Kyoto. It started running internet searches to find more images that it could then double-check against. And in typical o3 fashion, as you might know from Deep Research, it didn't just find a website or two.
It found 11, ranging from Tripadvisor to specific travel blogs to Reddit to YouTube. Yeah, it pulled up a YouTube video in the process to cross-reference these snippets of the image. And it noticed things like: hey, this only shows one horizontal rail, while the shrine I was considering typically has two, but that could be a trick of the vantage point, so I want to verify it. Then it went ahead and looked for that specific rail it assumed, again pulled in a bunch of different websites, and analyzed the images on them to cross-reference this. Eventually it showed me multiple images that match mine and said that no, this is not its first assumption; it's a small hillside bamboo walk inside Kōdai-ji Temple in Kyoto's Higashiyama district, which is 100% correct. This is not some obvious spot. The main temple is at the bottom; you kind of have to walk up into the gardens, and then this is a small passage tucked away in the woods behind the smaller shrine that sits behind the big temple. And it even says the person is standing roughly halfway along the illuminated bamboo trail that leads visitors back towards the main hall of Kōdai-ji. This is exactly correct, and it even gives you the exact coordinates. At this point, I was just blown away by the accuracy of this thing, and by the fact that it looked up dozens of websites, looked at the images there, cross-referenced them, and could identify this. And I told myself, "Okay, okay, okay. I've got to give it something even harder. How about something without a person and without a trail that people walk?" So I looked at my camera roll and found this little snap of a beautiful little bird in a river. And I just figured, how could it know this? I mean, it's just a river with some rocks and a generic-looking Japanese building in the back. Well, long story short, it got this right, too. What?
But I thought it was really interesting that it initially pulled up Python and wanted to extract the metadata from the image, which I had thought of already. None of these images included that: I had screenshotted the original image to make sure there were no GPS coordinates embedded in it, to make this actually challenging. So that didn't end up working. Then, again, it did what it did before. It made some assumptions, ran some searches, looked at all the websites, cross-referenced them, and after only 43 seconds, it concluded that this elegant grey heron is standing in the Shirakawa Canal. And then again, it gave the exact coordinates and a description of where you can find this exact spot, with other reference images that look similar to mine. And yep, that's the exact location where I took this picture. God damn, how is it this good? At this point, I felt a combination of excitement, but I was also kind of pissed at the tool. I just figured: how can it be this good? How can I break this thing? So I took out my phone at the coffee shop where I was preparing this video earlier today. I looked out, and I was in a beautiful coffee shop overlooking the river in Osaka. And when I looked closely at the other side of the river, there were these rocks tied up in a net with some turtles sitting on them. I shot a little video for you so you can visualize what I'm talking about. And funnily enough, when I shot the video, a little raven landed there, and all the turtles that were sunbathing kind of hopped off the rock. That was kind of a random moment. Anyway, with the telephoto lens on my iPhone, I snapped a picture of this, and I just wanted to break it at this point: give it something that it cannot figure out. And yeah, when I ran the prompt, after two and a half minutes, it actually got it wrong. It figured that this is in central Tokyo, which is not right. This is in Osaka.
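As an aside, the GPS check the model attempted is easy to reproduce yourself. Photos straight off a phone usually carry an EXIF APP1 segment that can include GPS tags, while a screenshot is re-encoded and loses it. Here's a minimal stdlib-only sketch that just reports whether a JPEG still carries an EXIF block at all (full tag parsing would normally use a library like Pillow or exifread, which I'm not assuming here):

```python
import struct

def jpeg_has_exif(data: bytes) -> bool:
    """Walk the JPEG marker segments and report whether an EXIF APP1 block exists."""
    if data[:2] != b"\xff\xd8":          # missing SOI marker: not a JPEG
        return False
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:              # lost sync with the marker stream
            break
        marker = data[i + 1]
        if marker == 0xDA:               # start-of-scan: no more metadata segments
            break
        length = struct.unpack(">H", data[i + 2:i + 4])[0]
        if marker == 0xE1 and data[i + 4:i + 10] == b"Exif\x00\x00":
            return True                  # APP1 segment carrying EXIF (incl. any GPS tags)
        i += 2 + length                  # skip to the next marker segment

# Demo with a synthetic JPEG: SOI + APP1("Exif") segment + EOI
sample = b"\xff\xd8\xff\xe1" + struct.pack(">H", 8) + b"Exif\x00\x00" + b"\xff\xd9"
print(jpeg_has_exif(sample))   # True: the EXIF segment is present
```

Running this on an original phone photo will typically return True, while a screenshot of the same photo returns False, which is why screenshotting is a quick way to strip location data before a test like mine.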
So then, when I followed up and said it's wrong, it's not in Tokyo, it did several weird things, like running Python code on top of the images. And look, it found these signs and tried to run some filters on them to boost the contrast and read what's on them, but the image was just too low-resolution for it to make out the signs. Side note: if they plug some of the top AI upscalers into this, it might have gotten it right. But after a total of 10 minutes of thinking, it actually figured out that these are the sandbags that line the inner moat of Osaka-jō, the Osaka Castle, in Osaka, Japan, which is correct. I mean, it's not exactly next to the castle; it's like a 25-minute walk down the canal. But just based on this low-resolution image and one follow-up prompt, it got it. I mean, this is just crazy. So I hope this example shows you the capabilities of this thing. It's not just that o3's thinking model is better than anything we've seen before; it's also the fact that it has access to every tool in ChatGPT. And before, even some power users might not have taken full advantage of the tooling in ChatGPT, just because you don't think of it, or often because one is too lazy. I mean, if you run a search, you kind of have to wait for it, and then you need to take that context and prompt on top of it; using the data analysis tool again took some time. It does all of that for you now and just presents you with the result. So this works really well if you do something that I refer to as goal-based prompting rather than instruction-based prompting: you just tell it where you want to go, and it figures out how to get there. That was always a strength of these thinking models.
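The contrast trick it ran on the signs is a classic percentile stretch: take the dark and bright ends of the actual pixel range and remap them to full black and white, so faint text separates from its background. Here's a pure-Python sketch on a flat list of 0-255 grayscale values; a real pipeline would use Pillow or OpenCV, so this is just to illustrate the idea:

```python
def stretch_contrast(pixels, lo_pct=5, hi_pct=95):
    """Remap the lo_pct..hi_pct percentile range of grayscale values to 0..255."""
    ranked = sorted(pixels)
    lo = ranked[len(ranked) * lo_pct // 100]
    hi = ranked[min(len(ranked) * hi_pct // 100, len(ranked) - 1)]
    if hi == lo:                       # flat image: nothing to stretch
        return pixels[:]
    # clamp to 0..255 so outliers beyond the chosen percentiles don't wrap around
    return [max(0, min(255, (p - lo) * 255 // (hi - lo))) for p in pixels]

# A murky, low-contrast strip of "sign" pixels becomes full-range
print(stretch_contrast([100, 110, 120, 130]))   # [0, 85, 170, 255]
```

Even so, as in my case, if the lettering occupies only a handful of pixels, no amount of contrast recovers the strokes; that's where an upscaler would have to come in first.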
Yet any AI skill that you might have acquired over the past few years is still relevant here, because if you know how to use these individual modalities, you can open up this thought process and reprompt it to do specific things for you if the results are not what you want. And there's still value in prompt engineering, because you can do more intricate things, like what I started trying afterwards. But I'm still playing with it, and I don't want to spend an hour showing you every single interesting use case. If I find enough, I'll do a separate video, as per usual. But let me just say that many of these business use cases, which I now teach in workshops, like this meeting analyzer, just work better than anything before. In some cases, they're just slightly better; in other cases, they're stunning, matching or exceeding what you would have gotten with Deep Research or proper prompting. But one thing remains constant: the bar to get some of these advanced results, which even in combination with AI would have taken quite a bit of effort to achieve, is now a simple one-line prompt. And the fact that the guardrails on things like searching for specific people were lowered in this release just unlocks a whole new world of possibilities. Whether that's AGI or not is up to you to decide, but it's certainly impressive. So that's my first look at o3. Then o4-mini and o4-mini-high are versions of it that are faster and cheaper when you use them through the API, but for most people, o3 will be the one you really want to be playing with right now. And for developers, the three different variations of the GPT-4.1 models are excellent at generating code; if you need a non-thinking model, that's the new go-to. They will be deprecating GPT-4.5 in the API, by the way, which is also kind of interesting.