Claude Code vs Cursor (GPT-5): Which should you choose?

Alex Finn · 12.08.2025 · 19:15 · 7,466 views · 118 likes · updated 18.02.2026

Video description

In this video I put Cursor w/ GPT 5 up against Claude Code to determine if Claude Code is still the AI coding king. We use the World Famous Patented Alex Finn AI Coding Benchmark (WFPAFACB) to see which tool is worth your money.

Follow my X: https://x.com/AlexFinnX
Sign up for my free newsletter: https://www.alexfinn.ai/subscribe
My $300k/yr AI app: https://www.creatorbuddy.io/

0:00 Intro
0:31 The Benchmark
1:15 FPS Benchmark Cursor
3:50 FPS benchmark Claude Code
5:44 Elon dance benchmark Cursor
7:33 Elon dance benchmark Claude Code
9:12 City flythrough Cursor
10:50 City flythrough Claude Code
13:05 Music Visualizer Cursor
14:32 Music Visualizer Claude Code
16:02 Winner of the benchmark

Contents (11 segments)

Intro

ChatGPT 5 just came out and it is the most polarizing AI model ever released. Half of people are saying it's the greatest AI model they've ever used. The other half is saying it's absolute dog water. In this video, I'm going to put it up against Claude Code with a world famous benchmarking test to see just how good it is and see if you should be keeping or getting rid of that $200 Claude Code subscription. Is GPT-5 the biggest bust of all time? And is Claude Code still king? Let's go. So

The Benchmark

we are going to put GPT-5 and Claude Code through their paces. We're going to put them through the world famous Alex Finn AI model benchmarking test. This benchmark test consists of four programming tests. We're going to build a 3D first-person shooter, an Elon dancing animation, a 3D city flythrough, and a music visualizer. After we go through these four intensive, mind-blowing, world famous tests, we will know which of the models is better and if you should stick with Claude Code or if it's time to switch to GPT-5 and the king's been dethroned. I'm also going to include all these prompts down below. So, if you want to take the world famous Alex Finn benchmark test and do it yourself, you can do it as well. So, the

FPS Benchmark Cursor

first benchmark test is building a 3D first-person shooter. I'm using this prompt here. I love this prompt in the Alex Finn patented benchmark test because it gives the model a ton of room for creativity. It allows it to come up with its own ideas for this 3D first-person shooter. Basically, it tells it to build a 3D first-person shooter using Three.js in a single HTML file. If you want to create any video games at home, I highly recommend Three.js. It's a great, easy-to-use library for AI. It tells it to add any mechanics, power-ups, and enemies that it thinks will make the game more fun and beautiful. So, we're doing GPT-5 inside of Cursor. Cursor came out on launch day and basically said this is the greatest model of all time. So let's see if maybe OpenAI slipped them a couple dollars or if they actually believe it. We'll see how good this turns out. As this is building, it's interesting to see that they brought Cursor onto that announcement video, because the whole Windsurf deal just fell through. So I wonder if part of them is just giving up on buying their own IDE and is just partnering with Cursor from here on out. Interesting to see where that goes. Building out the app, it's going to put it all in index.html. If you're building your own game at home and you want to build a quick test, putting it into one file, index.html, makes it super easy to open up. By the way, this prompt is down below as well if you want to test it out. All right, looks like it wrote 700 lines of code, which is pretty incredible. Let's run this and see what it looks like. All right, here we go. It is named Neon Drift. I like that name. WASD, of course. Wow, it made a time warp power-up. I like that. Shoot, dash. All right, let's see how this goes. Click to play. I actually like the styling of this. This is very Tron-like. I like the stars in the sky. Okay, here are the enemies. They're just kind of purple squares. I don't mind that. Let's see.
Okay, nice particle effects when I destroy them. At the bottom, I don't know what the purple is. Is the purple kind of my power? I'm not sure. It is a very big stage with enemies coming from each direction. This must be a power-up. I click on that. Yep, my health comes back. I mean, it's solid. I do kind of like the Tron vibes, but there's not much going on here. It didn't really add much to the environment. The reason I used this prompt is to give it room for creativity. And while I do like the vibes of the game, it didn't show much creativity when it comes to the enemies, the difficulty, or making it that much fun of a game at all. Oh, boom. I died. Okay, it just freezes the game. So, this is decent. From a benchmark perspective, we're going to rate each one of these from 1 to 10. I give it a 6.1 on the first-person shooter test. Now, let's go over to Claude Code and see how it performs on this benchmark. All right, we're now

FPS benchmark Claude Code

in Claude Code inside Cursor. I'm going to give it the same exact prompt. I'm going to hit enter. Let's see how Claude Code does here. I've built many games with Claude Code before. It handles Three.js really, really well. So, I'm expecting good things here. I was impressed to a point with GPT-5 from kind of a stylistic vibe perspective. From a total execution perspective, it could use some work. One thing I love about Claude Code is that it always adds a little bit extra, more than what you ask for, which I actually like. It seems GPT just added kind of the bare minimum. It didn't really have taste or opinion. I love Claude Code's taste. It always has taste and opinion. Everything always has a little bit of stank on it, which I love. So, let's see how it does with this test. All right, looks like it's all done. A cyberpunk-themed FPS with three weapon types, three enemy types, four power-ups. Wow. Okay, I have high expectations. All right, Neon Cyber FPS. It gave it a name and everything. Okay. Oh, so we actually can switch weapons here. Let's test this out. All right, so it works on the first shot here. Okay, so we have flying enemies here. Let's see. I have different weapons. I can switch to a laser. It seems like the only difference between the weapons is the different particles that come out of the gun. Okay, the rocket actually has orange projectiles that come out. I like that. From an environment perspective, I do like it better than the GPT one. I do like that there are buildings and shapes in here that I can run around. It does have a little bit more flair to it. From an enemy perspective, they both just gave me different floating shapes, just like triangles and squares. So that's fine. Let's go to our benchmark score sheet here. That game I'm probably going to give a solid 7.2. So, it's a solid 7.2 for Claude on the first-person shooter benchmark. Now, let's go to the next benchmark. A controversial benchmark for sure. This is the Elon dancing benchmark.
Let's see which model can create a better animation of Elon Musk dancing. So, I'm back in GPT-5 inside Cursor. And the

Elon dance benchmark Cursor

prompt I'm giving it is: produce a single standalone HTML document containing an inline SVG. So, we're creating an SVG of an animated Elon Musk dancing. It must use shapes. It must animate the body. So, the arms, the legs, the head must be moving. I like this test because, again, it tests the creativity of the model. One, how accurate can it make an Elon Musk animation? Two, how creative can it get with dance moves? Can it make it dance in interesting ways, or is it going to just move up and down? So, again, it tests not the instruction following, because every model is pretty much perfect now when it comes to instruction following. For me, AGI is really determined by creativity, thinking outside the box. How much can it be more than just an autocomplete? And so, little things like this, seeing what it thinks humans dance like, or seeing how it would animate an Elon Musk from scratch with no reference image, that to me is the true test of an AI's power. So, let's see how it does here. Okay, looks like it finished, about 85 lines of code. I was expecting more. Let's see how this goes. And here we go. Here is the dancing Elon Musk SVG. A few things I notice here. One is the dance is weak. The dance is just the arms and legs kind of swaying back and forth. Two, it looks nothing remotely like Elon Musk. I don't know if that's a unibrow or they just messed up the hair. Maybe they gave him an Elon Musk 1995 haircut, which is a little bit less hair than today. I don't know. It doesn't really look like Elon. I think they tried to put him in a suit, and I guess Elon doesn't really wear suits. So, tough outing on the Elon Musk benchmark here. I'm going to have to give this about a 3.8. Now, let's go over to Claude Code to see how it handles the Elon dancing benchmark. All right, so we're back in Claude Code. I

Elon dance benchmark Claude Code

put in the Elon prompt again. All the prompts, if you want to do these benchmarks for yourself, the world famous Alex Finn benchmark test. It's a benchmark that Sam Altman's talked about. Elon's talked about it. It's probably the most famous AI benchmark in the world. The prompts for these benchmarks are down below. All right, looks like the Elon dancing code has been created. Let's see how Claude Code stacks up in this benchmark. All right, here we go. Here's Elon Musk dancing. From a dancing perspective, I'd say it's a little bit more advanced. It's going up and down. The head's tilting a little bit. From a hair perspective, it's still more of a 1999 Elon Musk haircut. We're getting a little bit of baldness. From a dancing perspective, we're a little closer. Let's go back and forth here. So, you know, I'd say from a looks perspective, the head's completely removed from the body with GPT-5. Here, it's at least a little bit more connected. I think I've got to give Claude a little bit of an edge on the dancing. Like here, everything's moving so slow in GPT-5. With Claude, we're moving a little faster. It feels a little bit more like dancing. And also, I'd say the ratio of head to legs to arms is a little bit more accurate on Claude. So, from a dancing benchmark perspective, I think we have to give the edge to Claude Code. Let's go back to our spreadsheet. For me, this is more of a 5.1, a little bit better, on a 1 to 10 scale. Listen, I think the dancing SVG benchmark is still the benchmark that needs the most improvement from the models, but Claude Code does edge out GPT-5 a little bit here. Let's go to the third of the four Alex Finn world famous benchmarks. The city flythrough benchmark test. So this is

City flythrough Cursor

another interesting way to test how well a model can get creative and think outside of its guardrails. I'm giving it a prompt to build a simulated flythrough of a city. It needs to build an entire 3D city and have a camera fly through it, almost like a tour. So GPT-5 is going to get to work here. It's going to start generating. Again, we're using Three.js, so for anything 3D related, I highly recommend using Three.js. AIs work with it very, very well. There's tons of documentation online. If you want a fun weekend project, definitely put together a Three.js game. I promise you'll really enjoy it. All right, looks like the code completed. Let's see how this city flythrough does. All right, here we go. This is interesting. I feel like I'm on a roller coaster. It is going up and down aggressively. I'm getting carsick watching this right now. From a city perspective, it's tough to see. It's very dark. It didn't really add any sort of dynamic lighting or anything. It is hard to see what I'm seeing. From a camera perspective, it is aggressively going up and down. Like, I'm in an Uber and the Uber driver is just trying to get to the destination as quickly as humanly possible to collect more fares. This is a tough one. Let's go to our benchmark spreadsheet here. From a city flythrough perspective, I do like how many buildings there were. There are a ton of buildings in here, which is nice. But from a camera perspective, you really can't see what you're looking at because it's so up and down, bumpy. The FPS was nice. So, for this test, we're going to give it a 4.8. Now, let's go with the city flythrough over on Claude Code. All right. So, we put our prompt into Claude Code.

City flythrough Claude Code

I love this benchmark test because there are so many variables in this one, from the camera to the city to the lighting to the buildings to the frame rate. There's so much involved that the model has to juggle. That's why I really like this benchmark. And all four benchmarks in this Alex Finn patented benchmark test have tons of different variables, which I really think is a good way to measure these models, which is probably why Jensen Huang says it's the number one AI benchmark test he trusts. So, let's see how Claude Code does here. All right, city flythrough's complete. Let's test this out. Let's see how Claude does. Will it be better than ChatGPT? All right, so here we go. Here is the city flythrough from Claude. Let's see here. It looks like we're down in the streets here. The sky is this red color. It's kind of Gotham, Batman-esque. I kind of dig it in a way. From a camera perspective, it's upside down. Okay, are we going to go back? Oh, look. Did you see the windows going in and out? I kind of like that. The lights turning off and on in the windows. All right, now we're back right side up here. The city looks pretty nice. I kind of like it. Some of the windows are off the buildings. That's an issue. And we also go through the buildings, which isn't great. So, this is interesting. Claude Code can't handle the camera too well either. It's struggling with the camera. It is also kind of making me sick. But from a city perspective, a couple of positives here. At least I can see it, first of all. Second of all, some of the details are kind of nice, like the windows, the shadows on the buildings. I feel like I'm in the Robert Pattinson Batman, kind of. I actually dig it a little bit. The biggest issue clearly being the camera making me kind of sick here. Other than that, it does have a little bit more detail than GPT-5. I just wish it would stop going upside down. Overall, I'm giving that a slight edge over GPT-5.
I'm giving it a 5.2. These really aren't comparison scores. I'm trying to do this in a vacuum. I'd say overall that's a 5.2 out of 10 in a vacuum by itself, but slightly better overall than GPT-5. I'd say they both messed up the camera. I'd say the camera's a tad better in Claude. And the city was a little bit more detailed in Claude as well. So going into the final benchmark here, GPT-5 can do it. They can still pull it off. They're only down two and a half here. So let's see what we do in the music visualizer test. So, this final test, again, another

Music Visualizer Cursor

really nice creativity test where we're going to have the AI build out a music visualizer where it not only has to write the song, it has to write a visualizer around the song. So, how well can it make music and how well can it actually match up visuals to that music as well? So, I'm going to give that to ChatGPT 5 inside Cursor. Again, prompt down below. You can steal this and run your own benchmarks, too. All right, we have the in-browser synthwave here. Start. Let's start the track. Let's see how we go. Oh my god, this is pretty good. This sounds like a professional song. This sounds like it's out of Ryan Gosling's movie Drive. One of the best movies of all time, by the way. This is going to be right off the soundtrack. I think it's pretty good. I also really like the visualizer. You can see it's perfectly in sync with the song, too. It's really cool because it was able to not only make this track from scratch, but it also synced up the visuals with the track. So, it had to do a lot at once. So, this is very good by GPT. I'm really impressed. That was cool. I didn't know an AI model right out of the box could do that. I'm going to be honest, that's really good. I'm going to have to give that an 8.1. That was an 8.1. That was the most impressive job by GPT-5 yet. I really liked that. Now, let's go over to Claude to see how it can do. So, I gave the prompt to Claude Code. I'm gonna be honest, Claude

Music Visualizer Claude Code

Code's gonna have to do a good job here to keep up. That was really impressive by GPT-5. By far the most impressive one yet. The fact that it was able to make that song from scratch. Really, really impressive. Again, this is just more of an advanced creative benchmark. There are multiple variables it's going to have to come up with, and it has to sync the visuals and the audio, too. So, let's see how this does. All right, let's look at it. It looks like it's done here. All right, here we go. Synthwave. Let's play the synthwave. Oh my god. This is incredible. From a music perspective, I'm no Simon Cowell, but this seems more complex than the GPT-5 song. Also, from a visual perspective, it has strengths compared to GPT-5, but also has weaknesses. From a strengths perspective, there's a whole lot more going on here. The flashing, everything going up and down. Weaknesses: it seems a little scattered and all over the place. This is pretty solid, too. This is going to be a close

Winner of the benchmark

one. I actually think I lean to GPT a little bit more here. I'm going to have to go with a 7.4 on Claude on the music visualizer test. Final score of the world famous Alex Finn benchmark: Claude Code takes it, 24.9 against 22.8 for GPT-5. Now listen, is GPT-5 this revolutionary model that everyone was telling you it was when it launched? Unfortunately, I think Claude Code is still better. I think Claude Opus still takes the crown when it comes to coding, which is, yes, what these tests were. There's a lot more to AI models than just straight up coding. I like to do coding tests because I think at its core, coding is logic. Coding is the purest form of logic you can possibly have. So, if you can code really well, that means you have the best logic. So, from that perspective, I have to give it to Claude Opus. I'm not canceling the $200 a month Claude Code subscription. But I will say this as a side note, and I've been saying this for months now. From a chatbot perspective, from a companion-esque perspective of just wanting to bounce ideas, I do give it to GPT. I've been saying for a while now GPT o3 was the best chatbot I've ever used in my entire life. From bouncing ideas to getting advice, o3 was the best. And I do believe GPT-5 is a slight step up. The GPT-5 thinking model is a slight step up over o3. So if I'm doing straight idea bouncing, business planning, which I do a lot, I do spend at least an hour a day just straight up business planning with the AI models, I am using GPT-5. But from a coding perspective, which for me, and probably for you if you're watching this channel, because this is mostly an AI coding channel, is what you do with AI 90% of the time, Claude Code still takes it. It does still have a slight edge. And I know GPT-5 is significantly cheaper than Claude Code. Significantly cheaper, right? It's like a twelfth the price of Claude Opus. But I will say this, even for this 10% improvement with Claude Code, I'm still willing to pay $200 a month.
That 10% improvement will save you hours in the long run. So, I still am a big GPT guy when it comes to everything except for coding. I don't really use Claude for bouncing business ideas. I didn't like the creative ideas it came up with from a chatbot perspective. But from a coding perspective, Claude Code's still king. It still has better taste. It still outputs less buggy code. The code and the software it builds still just feel better. So, I'm sticking with Claude Code. Make sure to subscribe. Every time a new model comes out, I'm putting it through the world famous patented Alex Finn AI benchmarking test. All of the prompts for the world famous benchmark are down below. So, feel free to steal my benchmark if you want. I probably won't sue you for stealing it. With that being said, leave a like if you learned anything at all, and let me know which model you like better down below. Do you like GPT-5 better? You like Claude Code better? You bought into the hype? Let me know down below. and I'll see you in the next
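
As a quick check on the arithmetic, the per-test scores quoted in the transcript can be summed in a few lines of Python. This is a minimal sketch: the dictionary layout and the `total` helper are illustrative names, not anything shown in the video. Only the eight scores come from the transcript.

```python
# Scores quoted in the transcript, on the video's 1 to 10 scale.
# The dictionary layout and helper name are illustrative, not from the video.
SCORES = {
    "GPT-5 (Cursor)": {
        "FPS": 6.1,
        "Elon dance": 3.8,
        "City flythrough": 4.8,
        "Music visualizer": 8.1,
    },
    "Claude Code": {
        "FPS": 7.2,
        "Elon dance": 5.1,
        "City flythrough": 5.2,
        "Music visualizer": 7.4,
    },
}


def total(tool, tests=None):
    """Sum a tool's scores, optionally restricted to a subset of tests."""
    per_test = SCORES[tool]
    names = per_test if tests is None else tests
    # Round to one decimal to hide floating-point noise in the sums.
    return round(sum(per_test[name] for name in names), 1)


first_three = ["FPS", "Elon dance", "City flythrough"]
lead_after_three = round(
    total("Claude Code", first_three) - total("GPT-5 (Cursor)", first_three), 1
)
final_margin = round(total("Claude Code") - total("GPT-5 (Cursor)"), 1)

for tool in SCORES:
    print(f"{tool}: {total(tool)}")
print(f"Claude Code lead after three tests: {lead_after_three}")
print(f"Final margin: {final_margin}")
```

The totals come out to 22.8 for GPT-5 and 24.9 for Claude Code, matching the final score stated above. The lead after the first three tests works out to 2.8, slightly more than the "two and a half" mentioned before the music visualizer test.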
