Codex now works directly in Chrome on macOS and Windows.
It’s even better at working with apps and sites in Chrome, and now works in parallel across tabs in the background without taking over your browser.
To get started, install the Chrome plugin in the Codex app.
https://x.com/OpenAI/status/2052480800004956323
Table of contents (3 segments)
Segment 1 (00:00 - 05:00)
OpenAI's latest Codex update, with its Chrome plugin, makes it the king of browser automation. I'll give you an example and you'll see why I say that. I gave it a very simple prompt: "Use Chrome as a plugin to compose a latest AI news tweet (my profile is open on x.com), based on the top articles from Hacker News." It worked for 2 minutes. It first created a tab group and opened Hacker News. It read everything on Hacker News, picked an article, and followed the link to understand what is in that article. Then it composed the tweet, came back to me, and asked for permission to tweet. Obviously, I don't want it to tweet. I don't want people to think I'm using AI to tweet when I always tweet personally, because I'm chronically online.
But the main point is that this makes browser automation really useful for a variety of tasks: filling in a form, manipulating an Excel sheet, and a lot of other things. In this video, I'm going to show you how to use it. You'll see a couple of use cases for the Chrome plugin within Codex and how to get things done with it.
First things first: always update Codex. If you open Codex, you will see "Check for updates". Click it. If you already have the latest version, it won't show like this. If you don't, it's better to update. The second thing is to make sure you've got the latest model. GPT-5.5 is a really great model. I primarily use it with medium intelligence for tasks like this. If you're actually coding, you can go to high or extra high, but for browser automation I've had really good success with medium.
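The tweet example at the start follows a draft-then-ask pattern: the agent prepares the post but only a human decision lets it go out. Here is a minimal Python sketch of that pattern. It is not how Codex is implemented; the function names, the plain character count against a 280-character limit, and the callbacks are all assumptions made for illustration.

```python
from typing import Callable

TWEET_LIMIT = 280  # assumed limit; a plain character count, not X's own counting rules


def compose_tweet(title: str, url: str, limit: int = TWEET_LIMIT) -> str:
    """Build 'title url', shortening the title so the draft fits within limit."""
    room = limit - len(url) - 1  # one space separates the title and the link
    if room < 1:
        raise ValueError("the URL alone does not fit within the limit")
    if len(title) > room:
        # Reserve one character for the ellipsis that marks the cut.
        title = title[: room - 1].rstrip() + "…"
    return f"{title} {url}"


def post_if_approved(
    draft: str,
    approve: Callable[[str], bool],
    post: Callable[[str], None],
) -> bool:
    """Call post(draft) only when the approve callback returns True."""
    if approve(draft):
        post(draft)
        return True
    return False


if __name__ == "__main__":
    # Hypothetical article title and link, used only to exercise the functions.
    draft = compose_tweet("Example article title", "https://example.com/article")
    # Declining the request means nothing is posted, as in the video.
    posted = post_if_approved(draft, approve=lambda d: False, post=print)
    print("posted:", posted)
```

Keeping the approval step as a separate callback means the irreversible action (posting) can never run unless a person says yes.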
The next thing is to go to skills and make sure you've got Browser and Chrome enabled. These are the two important skills you need to enable so that Codex can use the browser. After that it's very simple: go to a new chat, invoke Chrome, and ask for whatever you want. In this case I'm going to say, "Use Chrome and open three tabs for the three best washing machines from amazon.in." I'll simply give this as the prompt and we'll see how it gets done.
Once I start it, it first spends some time thinking about what it has to do, and then it gets started on the task. We should see the browser tabs opening, hopefully in this window, and you can see it is already connecting to Chrome. It says, "I'll use the Chrome plugin," and it is connecting to Chrome, reading the Chrome skill, and then searching on Amazon. It's asking, "Can I use this?" Let's say "Always allow" in this case. It goes on to search Amazon, and you can see in the browser that it has created a tab group, along with the notice "Codex started debugging this browser." In the first tab it is searching Amazon for washing machines, and it is still extracting the listings. I'm going to close the Hacker News tab so that the old group is deleted. Yep, cool.
While it does the job, let's quickly look at what else you can use Codex for. I think Codex plus Chrome is going to be an extremely popular combination for a variety of reasons. First of all, you can use Chrome to do things that you would otherwise have to do manually. For example, one prompt example is: review the last week of Codex-related posts on community.openai.com, summarize the user feedback sentiment on recent launches along with the key user issues, and put it all in a spreadsheet. So you're making the AI go to two different places: it goes to the website, collects the details, lets the LLM turn everything into a structured format, and then puts it in a spreadsheet. And it works.
Next, you can also ask it to fill in forms, and you can combine two different skills, here Chrome and Gmail. You can say, "Check my Gmail for food-related expenses from my recent Portland trip, and add them to my expenses," in whatever tool you use. Again, this is pure browser-based work. You would typically hire a virtual assistant or an executive assistant for it, and you can get it done with the Codex Chrome extension, because it can do the repetitive browser work for you.
You can also use it while you're coding, especially on front end when you want certain tests to be done. The prompt for that is surprisingly interesting, because you invoke a few sub-agents and ask the sub-agents to use Chrome. The sub-agents can invoke Chrome, create different tabs within it, build games in those tabs, test the games, and finally use the result. Again, another very good
Segment 2 (05:00 - 10:00)
thing about this, compared with having a sandboxed Chrome browser, is that here you can work with a website you're already logged in to, so you don't have to share your authentication anywhere else, as with computer use or with Atlas. For example, I can have Gmail open and ask it to work within my Gmail. This way, I don't have to do any sort of authentication. I think you can achieve the same thing with computer use as well, but if you're using any other browser-based system, you have to authenticate separately for it, which we don't have to do here.
I think our task has been completed. There's a tick mark, and it says, "Done. I opened three amazon.in washing machine product tabs in Chrome." One is the LG 7 kilo 5 star, then the Whirlpool 7.5 kilo, and then the Samsung 8 kilo 5 star. I think this is a good choice. For the last few days I've been researching washing machines, and because I want a top load, I can see that it has made a decent enough selection. I asked for the three best, and it did not pick random brands. It picked Samsung, Whirlpool, and LG, and even within these, it picked models with the highest ratings. If you go back to Codex, you can see the actual thinking process. It says, "I found three solid products," and then one of the pages did not open, so it even swapped that tab for a working product. This is great.
Now I'm going to give it one more prompt: "Create a new spreadsheet and put these three links and the product names and pricing there." I could also invoke the spreadsheet plugin, but I'm not going to do that here. Let's see what it does. Let me say it clearly: do this using Chrome.
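The spreadsheet request boils down to a small table of link, product name, and price. Spreadsheet apps generally split pasted tab-separated text into cells, so one way to move such a table in a single paste is to build it as TSV first. The sketch below is a minimal illustration under stated assumptions, not a description of what Codex does internally: the links are placeholders (the video never shows the URLs), the column names are made up, and the prices are the ones read out later in the video.

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_tsv(header: Sequence[str], rows: Iterable[Sequence[object]]) -> str:
    """Serialize a header and rows as tab-separated text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical column names; the order follows the prompt (links, name, pricing).
HEADER = ["Link", "Product", "Price (INR)"]

# Placeholder links; names and prices as mentioned in the video.
ROWS = [
    ["https://example.com/lg-placeholder", "LG 7 kg 5 star", 17990],
    ["https://example.com/whirlpool-placeholder", "Whirlpool 7.5 kg", 18240],
    ["https://example.com/samsung-placeholder", "Samsung 8 kg 5 star", 19490],
]

if __name__ == "__main__":
    # Copy the printed text and paste it into cell A1 of a new sheet.
    print(rows_to_tsv(HEADER, ROWS), end="")
```

Building the whole table as one block of text means a single paste fills every cell, which is easier to verify afterwards than filling cells one by one.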
What we expect it to do is copy these elements from the three pages, go to a new spreadsheet, and add the information there. You can see it is already looking for the Amazon tabs, which is its first task. You can also see that it has used the browser and is trying to read everything. It says, "I found the three tabs still open in the Codex group," and now it says, "I'm claiming just those product tabs now." So it is reusing the three product tabs that are still open and extracting the product information from them.
It says the raw extraction is done, but Amazon exposes multiple price fields, including MRP, so it is doing a quick sanity check on the price area, which is again a good thing. Look at this page: there is a price with exchange, a price without exchange, a service fee, an EMI cost. This page is honestly a mess, but it knows the page is a mess and wants to pick the right field and use the right information. It is even doing that due diligence. If you're worried about rate limits, I'd encourage you to start with low intelligence, but medium is probably the right setting.
It managed to get all the pricing, and it got it correct: the LG is 17,990, the Samsung is 19,490, and the Whirlpool is 18,240. Despite this page showing so many prices, it picked the right one, which is exactly what I want. I don't want the EMI price or the with-exchange price; I want the price without exchange. Now it is asking me, "Can I create a new spreadsheet?" Even to open the new spreadsheet, it is using the shortcut, sheets.new. You can see it opens a spreadsheet here, and now it is going to put the information into it. "Codex started debugging this browser," and it is thinking right now.
If you only want browser access, this is a great plugin. Right now I'm using it with Brave. When this feature launched, I tested it with Brave and unfortunately it did not work, but as of this recording it works on Brave as well. And if your default browser is Chrome, it will work without any issue at all.
It is inspecting the sheet UI, and then it is going to paste the table. Oh, it even managed to add the title. You can see it clicking live. "Amazon washing machine comparison" is the title it has pasted, and we'll probably see it paste the pricing as well. Okay, it is verifying the sheet. It says the sheet is renamed and the table has been pasted, which is not correct. Let's see if it takes a screenshot. It is going to
Segment 3 (10:00 - 11:00)
see if the sheet has the table information. It will be very interesting to see whether it actually figures out that the pricing is not there. Okay, so only the title got pasted. It says it was a focus slip. Very interesting. "I'm clearing this stray cell, resetting the clipboard to the table." Good. Wow, it managed to do it. Cool. It is repasting into A1, and yeah.
Typically you would hire somebody to do this kind of market research, put the data in a spreadsheet, and send it back to you. Now, everything we did is a few minutes' task using Codex and the Chrome plugin. And this is just a sample. I can ask it to add one of the items to my cart. I can even ask it to fill in forms, like submitting expenses. The probability of getting things like this done is probably 100% at this point. You can look for flight tickets and do a lot of other things. I would use this feature primarily because, as I said, I don't have to authenticate separately. My tabs are already authenticated: my Google is logged in, my Twitter, and all those things. All I have to do is invoke Chrome and get it done, rather than going into a sandboxed environment and authenticating with everything. In fact, I can probably ask it to make the purchase if my Amazon account has a debit or credit card associated with it.
I think overall this is a great product. Obviously, OpenAI has not launched this for me to buy a washing machine, but as I said, the range of use cases is enormous. Go to skills, make sure you have the Browser skill and the Chrome skill, and grant permissions as and when it asks for them. Ultimately, I think this is a great feature.
Browser automation at its best, thanks to OpenAI Codex and also Chrome. Let me know what you want to use this for, but otherwise, see you in another video. Happy prompting.