OpenAI introduces ChatGPT Agents, a major upgrade that gives ChatGPT the ability to take real actions on your behalf. Agents can browse the web, use APIs, write and run code, and interact with files to complete complex tasks end to end.
This video explains how the new Agent system works, what tools are included, how it handles tasks securely, and what this means for the future of AI automation.
Learn more: https://openai.com/index/introducing-chatgpt-agent/
Table of contents (5 segments)
Segment 1 (00:00 - 05:00)
ChatGPT introduced AI agents about a week ago, and AI agents are basically intended to do things for you on the internet. So they could take over a virtual computer. They could think, they could reason, and they could do deep research, and then combine that with the power of the agent to actually click things and fill out boxes. All of that sounded good. So, I spent about a week trying to make a video to show you all kinds of different useful ways to use ChatGPT agents. And I've got to be honest, after trying all kinds of different things and sitting through all kinds of long waits, I only have maybe a couple of different things that I could recommend where ChatGPT agents make sense. Right? So, the whole point of this channel is I show you practical applications for using different AI tools. ChatGPT agents might be one of the most disappointing updates that I've ever seen come out of ChatGPT. They usually really blow me away with their updates. This is not one of them. There are a couple of things that are interesting about it, but let me demo it for you and then you could judge for yourself. Okay, so if you have a ChatGPT Pro account, you definitely have AI agents, but a couple of days ago, they started rolling it out to Plus and Team members, too. So, if you have a paid account, there's a good chance that if you click on Tools, you'll see this agent mode over here. Now, here's one of my favorite parts of this agent mode. Usually, if you haven't used any type of AI agent platform, you're not really sure what AI agents can actually do. I've made a ton of videos about AI agents. I cover a ton of different useful AI agent platforms: n8n, Lindy, Make, Zapier, and there are a ton of other ones, too. But let's see what AI agents inside of ChatGPT could do. So they broke it down into these five categories, right? The suggested categories just give you some options over here. It could also create reports for you. It could take actions for you on websites.
Like for example, this one says, "Review and prioritize my inbox after vacation." They could create spreadsheets for you. So that's another thing an AI agent could do. And they could do presentations and give you a PowerPoint output. Okay. So, I'm going to show you some examples. I ran a ton of different examples, some of these suggestions, some on my own. So, just to simplify what agents are in this case and how they came together: ChatGPT, if I click over here, has something called deep research. Now, deep research inside of ChatGPT (and other AI apps like Google Gemini have this too) is one of my all-time favorite AI updates. It goes and searches dozens, sometimes hundreds of websites and it gives you a really solid answer after 10, 12 minutes sometimes, right? Great. In text format. Well, then they had something called ChatGPT Operator, which I covered in a different video. I was very disappointed in that too. That gave it computer use. It could go and launch a browser on ChatGPT and you could input things. You could take over, but it would by itself go do things for you, find flights for you or find a pair of pants for you and things like that. Well, agent mode combines deep research, which you could still use by itself, and Operator, which just got rolled into this agent mode. Okay, so now agent mode could think and it could reason and it could do deep research and it could take actions, right? That's the promise at least right now. So, let's go through some of these examples. Now, this one is from one of the suggestions down here. It's basically asking to research the process of forming an LLC, right? A lot of people use this for work and business. So, I'm going to give you a lot of business use cases. For some reason, in their demo, they had a very personal use case for finding you like a pair of pants for a wedding, which, uh, I think I could do without an agent right now. So, it was a little bit glorified of an example.
They waited like 15 minutes for it to not find a good pair of pants. But, okay, let's say this is the example and I have the agent selected here. I'm going to go ahead and send this out. Now, this works a little bit differently than anything you've seen. So, it's thinking right now. Then, it sets up my desktop. So, it sets up a computer, a virtual computer that it could launch a browser on, and it will launch a browser in the background, and it will actually go do things for you. So, you could kind of see what it's doing. This is kind of the preview of what it's doing. In this case, it looked like it pretty much Googled what I gave it, and it's going to this reading mode. So, it's reading different pages and it's trying to come up with an answer. And then when this is done, you could actually review all the different steps that it took. If you press the three dots at any time, you could stop it or you could take over. So, a lot of times it's going to ask you to take over to log into something, for example, or enter payment information. We're going to get into that in a second. You could take over just using this option and it will give you a pop-up. So, you could manually take over and then you use this browser. I'll show you an example in a second. Okay. So after about six minutes, this is the output that it gave
Segment 2 (05:00 - 10:00)
me in text format. This is the output, and it pulled all this information from LegalZoom, and it just kind of gave me this formatting of here's a bunch of different states you could do this in, and it created a spreadsheet for me, right? And all the sources came from one website because my prompt specifically said to look for LegalZoom. If I didn't say that, it would do the deep research format. Now, here's the funny part about this, right? That seems like, okay, that could be cool. I went to ChatGPT, just regular ChatGPT. I used the same exact prompt. Okay, I'm going to use it right now. Doing a web search. I didn't do anything else, right? It took two seconds. And let me just go down. Look at this. It created this table for me with all the information I need in two seconds. Right, the agent took six minutes. It did not format it at all in the way I wanted. And I basically had to figure out how to format the spreadsheet to look something like this. This is with zero follow-up prompts. So you see my point here. It's just like a glorified version of what ChatGPT could do, with just like a nice UI, and I could see what it's doing with its search and with clicking around on different websites. This is already doing that almost instantly. So that is one example of why I did not think this was worth the upgrade at all. Okay, the next example I wanted to show you is under the actions, right? So, with actions, you could book things. You could order banana cream pie ingredients on Instacart. You could order food from DoorDash. And these are all the different options that it gave you as suggestions, but pretty much anything you want, you could go ahead and paste that as a prompt. So, book a four-star hotel from hotels.com. Now, I use this because it narrows it down even more. If you take out hotels.com, it takes even longer. I told it the date. I told it how much I want to spend. I want to have an indoor pool, right?
This is actually one of the prompts they use as an example, too. So, let's send this out. This is a little bit more interesting because now it's going to be able to do the search and the action that it takes to book the hotel. Right? This is what Operator was supposed to do before it got turned into ChatGPT agent. Okay? So, it went to hotels.com on a Chrome tab. And right here, it actually shows you what it's doing. It's kind of showing you how it's thinking through it. So, I do like that. A lot of the time, when you use kind of the step-by-step thinking mode, you could kind of expand that out to get an overview of how it's thinking in the background. And I ran this one across multiple different things like this, like ordering things online and anything that required a specific website search. But I'll show you what we come up with. Okay, so 25 minutes later, I got a hotel that it found for me. And here's a link. And this is where the link takes me. It took 25 minutes for that. I went on hotels.com myself and did the exact same thing in exactly 45 seconds, by the way, just using the filters on the left side. So, I really don't understand why anyone in their right mind would do this on tasks that take like a minute to do. And then at the end of it, it says, "I got this page for you." But again, the link did not work. But literally, these are all the different steps that it had to go through after 25 minutes to get to this checkout page. Now, at this point, I have to manually take over, right? So, I could just take over the browser here and I could then put in my payment information, which brings up this next point. Okay? And I'll show you with another example, too. You see this right here? "This may put your data at risk. Signing ChatGPT into websites can expose your data to malicious sites." Well, I mean, I usually am the first person to test out new things. I'm all about that. But that's kind of crazy.
Why would I go put in my credit card information and then give this agent access to it if it gives me this kind of warning, right? This is a virtual browser. This is not literally on my computer. This is the browser that's on their computer and it's highly delayed. I'm just going to press continue to show you this. At least every time I've used it, this is not restoring the computer because this is a little bit of an older chat here. Okay. And this one, I guess, is not able to restore this older chat. So, I'll let this one go. I'll do it on the next one, though. I'm going to take over the computer. I've got a good example on the next one. Okay. Here's a really cool use case. Do an audit of my Google Calendar for the last six months and tell me how I've spent my time. Right? It could do things with your calendar, with your Gmail, pretty much anything you could give it access to. It could do that. But let me show you what ends up happening here. Okay? Okay, it says, "I'll start by accessing your
Segment 3 (10:00 - 15:00)
Google Calendar to review your events for the last six months, categorize it based on time spent, and I'll give you a clear summary of key takeaways," right? Most of us could use something like this, but this is what you have to do in order for this to work. Okay, in my very first attempt, it went to example.com and it said, "I ran into a snag," and it wasn't able to do it. So, we'll go ahead and try again. Now, when it did work, this is what it was able to give me here. So, it did kind of a Google search to try to figure out how to get to Google Calendar for a while. And then when it did find it, it looked up how to export your Google Calendar, again using a Google search. Okay? And then it brings you to this page where you need to take over. So, it gives you an option here. And then you could also hover over your screen here and take over from here. Okay. For some reason, when I took over, it just redirected to Wikipedia. But I could come up here and go to Google and press sign in, I guess. Okay. Now, I have to trust this and give it my Google credentials to log into my Google account. Now, I don't know about you, but I have everything tied into my Google account, right? Okay. Well, I guess for this example, I'll give it a try. But remember the big pop-up that it gave us, that this could actually put your data at risk. So I would not recommend anyone do this part. This is the most useful part because it connects to your own data. But the risk is kind of not worth it. Okay. So I did log in. I'm going to say finish controlling. Now let's see if this figures out how to redirect, because it took us out of Google and to Wikipedia. So I manually went to Google. Let's see if it could figure out how to get there again. Okay, it's inside of my Google Calendar now. It's going to the settings tab over here. And let's see if it could figure it out.
Now, I just gave it one of my other, not main, accounts here, again, just for security reasons, but it is a Google account I do use. Okay, this is interesting. It actually loaded up a terminal and it's parsing a zip file that it got from Google Calendar. So, let's see what it comes up with after it's done. Okay, so I'm going to blur out some of this information here, but it looks like it did do a good job and, for the most part, it did put it inside of different categories and it did find the right date range and pretty much all the events I had on this one specific calendar. And it did give me a good breakdown and suggestions on exactly what I need to do. So, a couple of snags in the beginning of it, but this so far is one of the better use cases that I found that actually worked, right? This is something that would be kind of hard for me to even figure out how to do. It was one of the suggested prompts, and everyone's trying to optimize their time. It's a great way to go and look at your time for the last 3 months, 6 months, see where you spend it if you do use a calendar like this, and get a big-picture overview of where you could save more time. Okay, let me show you the presentation option, because pretty much all of us have to at some point make a presentation, and having ChatGPT agent do it for you, that's great. Okay, this prompt is good. Develop a go-to-market presentation for a luxury hospitality brand entering a new international market, and then it gives it a little bit more context on what we're looking for. So, let me send this out. Now, you could already do this with ChatGPT and ask for a PowerPoint and combine it with deep research. So the agent part, I'm really not sure where it comes into play, because I did exactly the same thing using deep research without agent and I got a very similar result. So I'll just show you the results here in a minute. Okay, so it took about 15 minutes and that's not too long.
Deep research sometimes takes about that long anyway, but it did go through it using this kind of visual interface too. This is what I got. So this is the presentation. Let me just make it full screen here so you see what it looks like. So, typical PowerPoint presentation, and the information is pretty good. The design is pretty bad though. And you could actually play it as a slideshow if you click this. So, it looks like a real nice PowerPoint presentation, right? Just not with good design. But I'm still going to count this as one of the useful options of just ChatGPT in general. Even though it doesn't really know how to design good slides, it does the research for you. But I can't give credit to AI agent mode because I just did the exact same thing with deep research. Now, here's where this gets really interesting, and this is one of my favorite workflows even with deep research. I usually create my slideshows like that. I then edit them in PowerPoint and re-export them or save them out again inside of PowerPoint so it has my finishing touch to it. Then I bring it inside of gamma.app and I use this option right here, import a file.
Segment 4 (15:00 - 20:00)
This lets me import that PowerPoint which I got directly out of ChatGPT. ChatGPT does create this PPTX PowerPoint format, and you have two options. Okay, if you don't really want to change the PowerPoint all that much, so keep kind of the visual design but make it a lot more professional looking, this is a good option. Visual import: polishing your slides, preserving text and images, and adapting the layout to Gamma format. I'm going to show you this next, but this is my favorite one, transform content. But a lot of people don't want to change it that much, and this ends up being too much for them. So, the visual import option lets you pick a theme. Let's say I just picked this theme. I'm just going to press continue, and it's going to put the original slide and the new slide side by side here. And I'm just doing this in real time without the edit. This is the slide out of ChatGPT from the ChatGPT agent. And here's the slide it's going to make for you. And it's going to do that with the entire slideshow. Now, depending on the plan, if you have the free plan, you're going to be limited on how many slides. But with the paid plan, you could get a lot more slides here. So, you could see same kind of layout, just looks a whole lot better, right? And it looks like it actually did not import the bar chart and recreated it in a different way, right? So sometimes it's going to have issues with this option that you might have to redo. But overall, yeah, this one it pulled in, made the table again. And this is in Gamma format and it's happening pretty quickly. Couple minutes, right? And you could go ahead and continue this. You could publish this on the web. You could download it as a PowerPoint. You could also present it this way just directly from this page over here. You could delete slides and everything is customizable. And you could change images. So that's great. But let me show you that other option inside of Gamma.
If you choose transform content, this gets a lot more interesting. Let me just press continue here. As a presentation, it lets you decide if you want minimal, detailed, concise, or extensive type. You could give it more information here. You could choose image styles here. It's going to create AI-generated images. I'm going to press generate here to show you what we get. Now, it creates an entirely new slideshow for you. Okay. So, the title screen looks about the same. But these other screens usually look a whole lot better. So, I'm going to let this finish up. And if it's too much text for you on the screen, again, I recommend the more concise options. It won't do as much writing as this one does. And if I just kind of scroll through now that it's done, you could see it's created these images for us. A lot of times these are AI generated, but you can import your own. This uses some of the top AI image models inside of Gamma here. So really nice results that we got out of this one. Okay. So if you're willing to experiment, you could see there's a couple of things here that we can use AI agents for. But as I mentioned, a lot of that is just from deep research. So you don't really need the agent part. The agent part comes in with the Operator section of it, where it takes over the computer and it could input things. But you have to be okay with the issues when it comes to malware and whatever they warned us about in that pop-up. Let me show you what Sam Altman said in their own demo. And this is why I'm just not willing to give it credit card information and let it buy things for me or even log into things like my main Google account. - As Casey mentioned, although this is an extremely exciting new technology, there are new risks. Uh, people learned how to use the internet generally pretty safely, although of course there are still scammers and other attacks.
People are going to need to learn to use AI agents, uh, and society's going to need to learn to build up defenses against attacks on AI agents as well. So we're starting with a very robust system, lots of warnings. We will relax that over time as people get more comfortable with it. But we do want people to treat this as a new technology and a new risk surface and use all of the caution that Casey talked about. - Okay. I've been covering ChatGPT since December of 2022. And I've got to tell you, this is one of the most disappointing upgrades. And then on top of that, they rolled this out into a $200 plan that a lot of people got just for this option. And it's the most beta product that I've ever seen, with potential security risks. So, I just feel like with these types of things, when they come out, you're really excited about it, and then sometimes they get really hyped up, and then you sign up for a $200 plan or even the $20 or the $30 plan, right? You pay for it thinking it's going to change your workflow, and this has literally done nothing for me but waste about a week of time trying to figure out useful ways to use it. So, again, I like to keep these videos as transparent as I can. When I find something I really like, I like to make a video about it and I'm usually overly excited about it. A lot of times those are ChatGPT updates. This one is not one of them, and I wanted to make this video so you see it in action without me editing anything out. I just showed you the result that I'm getting out of the agent mode and this is what
Segment 5 (20:00 - 20:00)
I'm getting. So I'm sure they're going to make it better. Operator, by the way, never got better. It just kind of got rebranded to AI agent. So, that was also part of the $200 plan that was kind of doing that, uh, takeover mode. Didn't really work. So, I get why they have to release things in kind of beta mode, but this is just, I don't know. Anyway, let me know what you think of it in the comments section. Hopefully, this sheds some light on it for you. I'll see you in the next video.