GPT-5.1 Takes Over Opus 4.1 in Coding?


Ray Amjad · 14.11.2025 (updated 18.02.2026)
Video description
Level up with my Claude Code Masterclass 👉 https://www.masterclaudecode.com/
Learn the AI I'm learning with my newsletter 👉 https://newsletter.rayamjad.com/
Got any questions? DM me on Instagram 👉 https://www.instagram.com/theramjad/
🎙️ Sign up to the HyperWhisper Windows Waitlist 👉 https://forms.gle/yCuqmEUrfKKnd6sN7

Since I've never accepted a sponsor, my videos are made possible by...

—— MY CLASSES ——
🚀 Claude Code Masterclass: https://www.masterclaudecode.com/?utm_source=youtube&utm_campaign=SdeVQ0YyMvc - Use coupon code YEAR2026 for 35% off

—— MY APPS ——
🎙️ HyperWhisper, write 5x faster with your voice: https://www.hyperwhisper.com/?utm_source=youtube&utm_campaign=SdeVQ0YyMvc - Use coupon code YEAR2026 for 35% off
📲 Tensor AI: Never Miss the AI News - on iOS: https://apps.apple.com/us/app/ai-news-tensor-ai/id6746403746 - on Android: https://play.google.com/store/apps/details?id=app.tensorai.tensorai - 100% FREE
📹 VidTempla, Manage YouTube Descriptions at Scale: http://vidtempla.com/?utm_source=youtube&utm_campaign=SdeVQ0YyMvc
💬 AgentStack, AI agents for customer support and sales: https://www.agentstack.build/?utm_source=youtube&utm_campaign=SdeVQ0YyMvc - Request private beta by emailing r@rayamjad.com

————— CONNECT WITH ME
🐦 X: https://x.com/@theramjad
👥 LinkedIn: https://www.linkedin.com/in/rayamjad/
📸 Instagram: https://www.instagram.com/theramjad/
🌍 My website/blog: https://www.rayamjad.com/
—————

Links:
- https://openai.com/index/gpt-5-1-for-developers/
- https://www.anthropic.com/news/claude-sonnet-4-5
- https://x.com/xeophon_/status/1989043950695641166?s=12
- https://x.com/embirico/status/1989081059041116598?s=12
- https://openrouter.ai/openai/gpt-5-codex
- https://openrouter.ai/openai/gpt-5.1-codex
- https://openrouter.ai/openai/gpt-5.1-codex-mini
- https://x.com/coderabbitai/status/1989035006774354387?s=12
- https://www.coderabbit.ai/blog/gpt-51-for-code-related-tasks-higher-signal-at-lower-volume

Timestamps:
00:00 - Intro
01:24 - Speeds & Mini Models
02:01 - Code Review
02:45 - My Tests
03:13 - Designs
05:16 - Making an iOS App

Table of contents (6 segments)

  1. 0:00 Intro (308 words)
  2. 1:24 Speeds & Mini Models (117 words)
  3. 2:01 Code Review (128 words)
  4. 2:45 My Tests (99 words)
  5. 3:13 Designs (451 words)
  6. 5:16 Making an iOS App (853 words)
0:00

Intro

Okay, so GPT-5.1 came out in Codex CLI early today, and we'll be going through it in this video to see how it compares to GPT-5 when it comes to vibe-coding. We'll start by going through the release notes and what some people online are saying about it, and then go through my own experience vibe-coding with it.

So they say they made GPT-5.1 faster because they overhauled the way they trained it to think, which means it now spends less time on easy tasks and more time on harder tasks. You can see here that GPT-5.1 medium generates fewer tokens per response across all these tasks. And finally, you can see that on SWE-bench Verified, GPT-5.1 High gets a better score of 76%, compared to GPT-5 High, which scored 73%.

Now, that is slightly behind Claude Sonnet 4.5, which gets a score of 77.2%, but GPT-5.1 High now beats Opus 4.1. For some reason they did not include the GPT-5.1 Codex scores, only the GPT-5.1 scores. Anyway, the GPT-5.1 Codex models did land in Codex, and they say that GPT-5.1 Codex is better for Codex on macOS and Linux, while GPT-5.1 is better for Codex on Windows. They say that's because the GPT-5.1 Codex models are not as good at PowerShell as GPT-5.1 is, which means that if you are using PowerShell, and not something like Windows Subsystem for Linux, then you should probably use GPT-5.1 instead.

And that's because the environment the Codex models are reinforcement-learned in is actually bash. Anyway, one thing that I did notice when doing my testing between GPT-5 Codex and GPT-5.1
1:24

Speeds & Mini Models

Codex is that GPT-5 Codex has become slower now, probably because they're phasing the model out. You can see on OpenRouter that it was performing at around 25 to 30 tokens per second, and now it has dropped to about 15-16 tokens per second. Meanwhile, GPT-5.1 Codex has been running at around 30 tokens per second for most of today.

And GPT-5.1 Codex Mini is running at around 57 tokens per second, which means it should feel twice as fast. They also say it's four times more cost-efficient, which means that if you use GPT-5.1 Codex Mini, you won't hit your rate limits as soon. And now, as for the
2:01

Code Review

final bit of online verdict: CodeRabbit says GPT-5.1 isn't just a faster GPT-5. In their own evals of the model, they found it's the highest-precision model they have ever tested for code-related tasks like code review. They measured a precision increase of between 7 and 50%, which is a pretty big range.

They say that compared to GPT-5 Codex and Sonnet 4.5, GPT-5.1's comments feel leaner, more conversational, and closer to how experienced engineers actually communicate.

They also say GPT-5.1 follows context better, because when prompts are vague, it explicitly explains its assumptions. The CodeRabbit team then explains a bunch more, including where GPT-5.1 still lags behind. So basically, I got GPT-5,
2:45

My Tests

5.1, 5 Codex, and 5.1 Codex to do a bunch of different things. The first one was making an iOS version of my application, HyperWhisper, which is my speech-to-text application. There will be a coupon code down below if you are interested in buying it.

And yeah, a lot of people have asked for an iOS application, so I figured it was worth trying to get one started. Then I got them to do some landing page designs, because I'm pretty interested in how the models have evolved when it comes to designing things.
3:13

Designs

So this is the design that GPT-5 came up with when introducing itself. The prompt was the same each time: I said, please introduce yourself, the model, in a complete landing page. It has a features section, a purple theme, and this live demo, which may actually work.

And yeah, it does seem to respond with something. Oh, it turns out it actually responds with the same thing every single time. Anyway, this is GPT-5 again: I gave a new session the exact same prompt, ignoring the previous context. This one actually seems slightly longer; it still has a very purple-esque design and the components of a landing page. It does not seem to have a footer this time, which is interesting.

Then I gave the same prompt to GPT-5.1, and this is the page it came up with. Interestingly enough, it did not use any purple; it went with green instead this time. It seems to be longer, which is pretty good. The FAQ also works.

It also added a footer and a call to action at the bottom, so it does seem more complete compared to GPT-5. And this is another attempt as well.

Honestly, I think this may be the best of all the landing pages. It is slightly messed up at the top over here, but it does feel more like a landing page; I think that's actually because of this right-hand bar over here.

Now let me show you GPT-5.1 Codex as well. So this is GPT-5.1 Codex. I do like that the color scheme here is better than the purple design, but it does seem a little too compact for my liking; the padding is kind of off.

And then this is GPT-5.1 Codex again in a different chat, and this nav bar is totally messed up. It didn't build it properly, but the padding here actually does seem better, so I think that's a good thing. It did add the footer in this case, as well as in the previous one. So honestly, it seems that GPT-5.1 is slightly better when it comes to design; I would probably stick with 5.1 myself instead of 5.1 Codex. Anyway, I will be trying out more design tasks over the coming days and sharing my thoughts in my AI startup school, which you can join using the link in the description down below. A bunch of people have already joined and found some value in it, so maybe you will too.
5:16

Making an iOS App

As for a much more complicated task, I asked it to make an iOS version of my application, HyperWhisper. So I came up with a somewhat detailed plan for how everything should look, and I read through the plan and was like, okay, this looks good.

Then I gave this plan to GPT-5 Codex medium thinking and GPT-5.1 Codex medium thinking, and I made some notes for myself here.

I would say that GPT-5.1 Codex is faster, because it took 13 minutes compared to GPT-5 Codex's 30 minutes. And when I asked both models, are you done? They basically said, yeah, we're done.

I think part of this performance difference is that they have slowed GPT-5 Codex down to half the speed it used to run at, so maybe the two would have taken roughly the same time before. Then I got Claude Code to look at both of them and compare the solutions to see which one is better.

Even though it says GPT-5 here, this actually refers to GPT-5 Codex. After Sonnet 4.5 looked through both codebases, it said that GPT-5 Codex's version is more advanced and feature-complete, with better modularity and a more modern Swift implementation, while GPT-5.1 Codex did not do as good a job, because it mostly just laid out some boilerplate for me to use. You can see that when it comes to feature capabilities, some of the features are actually missing compared with GPT-5 Codex. And yeah, honestly, it seems kind of disappointing.

I guess it is using fewer tokens per turn, but that also means any implementation it does produce may be less complete, because maybe they're training it to use fewer tokens per turn rather than checking whether its implementation has actually been completed. But I did use GPT-5.1 Codex here, so maybe it would have done a better job if I had just used plain GPT-5.1 instead.

Okay, this is the version that GPT-5.1 Codex made.
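As an aside, the timing claim a moment ago — that GPT-5 Codex might have matched GPT-5.1 Codex's time before the slowdown — can be sanity-checked with quick arithmetic. This sketch uses the token-per-second figures quoted earlier from OpenRouter (taking 15.5 and 27.5 as midpoints of the quoted ranges) and assumes, as a simplification, that run time is dominated by token generation:

```python
# Sanity check: GPT-5 Codex took 30 minutes at its current ~15.5 tok/s.
# What would the same run have taken at its earlier ~27.5 tok/s
# (midpoint of the 25-30 tok/s range quoted from OpenRouter)?

def adjusted_minutes(observed_minutes: float, current_tps: float, old_tps: float) -> float:
    """Minutes the same run would take at the old throughput.

    Assumes total run time scales inversely with generation speed,
    i.e. the run is dominated by token generation, not tool calls.
    """
    return observed_minutes * current_tps / old_tps

# 30 min at 15.5 tok/s corresponds to roughly 17 min at 27.5 tok/s -
# much closer to GPT-5.1 Codex's observed 13-minute run.
print(round(adjusted_minutes(30, 15.5, 27.5)))  # prints 17
```

So under that (rough) assumption, most of the 30-vs-13-minute gap is explained by the slowdown rather than by the new model doing dramatically less work per token.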
So there's a license key page, and this is what it looks like. For some reason the screen resolution is messed up, but I think that's a simulator issue.

It had the history view, and the settings were kind of added to my keyboard. And honestly, I told it to make the design like the desktop application, and it did a pretty bad job at that. When I press the record button, the app just crashes instead of asking me to allow microphone recording permissions.

So let's see what GPT-5 Codex did instead. Okay, this is the version that GPT-5 Codex made. You can see there's a record page and a history page, and the design is kind of messed up here.

Then the settings page has a vocabulary manager too, and you enter your license key. There's no validation system. Or actually, maybe there is, because it says the key is wrong.

There are also keyboard instructions, open new, and open settings, and then tap to record doesn't actually do anything. So I think turning a macOS desktop application into a mobile application is very difficult to do in one session, and I'd have to iterate with the model back and forth. But yeah, honestly, I think neither of them did as good a job as I had hoped.

GPT-5 Codex did manage to go further than GPT-5.1 Codex, so I will probably use its version as a starting point, continue to iterate on it until it's a working application, and then finally publish it to the iOS App Store.

My conclusion here is probably that GPT-5.1 Codex does seem more conversational and better at design. If you want even better design, you would want to go for GPT-5.1 instead. But GPT-5.1 Codex does seem to produce fewer tokens per turn.

So if you're getting it to do a really big task, it probably won't do as much of the task as you would have liked, and that's why you can see the implementation from GPT-5.1 Codex being worse than the GPT-5 Codex implementation.

Anyway, if you do want to buy the macOS version of HyperWhisper, it will be linked down below with a coupon code for Black Friday. And if you want to learn more about Codex CLI and Claude Code, I have masterclasses on both linked down below, where I go through every single feature as well as a bunch of bonus content. Finally, I will be sharing more of my thoughts on GPT-5.1 and GPT-5.1 Codex within my community as well, so if you are interested in that, there will be a link down below.
