Kimi K2 vs Claude Code: When Cheaper Isn't Always Better

9:08

Ray Amjad · 14.07.2025 · 25,888 views · 380 likes · updated 18.02.2026

Video description

Join AI Startup School & learn to vibe code and get paying customers for your apps ⤵️
https://www.skool.com/ai-startup-school

📲 Stay up to date on AI with my app Tensor AI
- on iOS: https://apps.apple.com/us/app/ai-news-tensor-ai/id6746403746
- on Android: https://play.google.com/store/apps/details?id=app.tensorai.tensorai

CONNECT WITH ME
📸 Instagram: https://www.instagram.com/theramjad/
👨‍💻 LinkedIn: https://www.linkedin.com/in/rayamjad/
🌍 My website/blog: https://www.rayamjad.com/

Set environment variables:

```
export ANTHROPIC_BASE_URL=https://api.moonshot.ai/anthropic
export ANTHROPIC_AUTH_TOKEN=YOUR_MOONSHOT_API
```

Links Mentioned:
- Moonshot API Platform: https://platform.moonshot.ai/console/account
- Using It with OpenRouter: https://huggingface.co/blog/francesca-petracci/kimi-k2-claude-code

Timestamps:
00:00 - Introduction
00:52 - Setting Up Kimi K2
03:02 - Fixing the API Endpoint
03:28 - Testing to Make Sure It Works
03:53 - Coding...
04:50 - Checking the Result
07:11 - Overall Thoughts
08:27 - OpenRouter

Contents (8 segments)

Introduction

So about 2 days ago, a brand new open-source model called Kimi K2 was released by a Chinese lab called Moonshot AI. It seems to be doing pretty well on the benchmarks over here, and there are a lot of hype posts about it online, like "Kimi K2 just crushed every industry benchmark" or it's "the best open-source model and beats Claude 4 Sonnet". A lot of these hype posts are driven by these benchmark figures over here. And it's no surprise by now that a lot of these companies kind of rig the benchmarks in their favor or use some hacky behind-the-scenes tactics, so the model ends up doing really well on the benchmarks, but no one actually uses it on their production codebases.

But in this video, we're going to be comparing it against Claude 4 Sonnet using Claude Code and also Gemini 2.5 Pro using Gemini CLI. In a previous video, I compared Claude 4 Sonnet using Claude Code to Gemini CLI over here, and in that video Claude Code did better than Gemini CLI. So I'm going to give the exact same instructions to Kimi K2. One way of doing this is that Claude Code can now actually support Kimi. So, if

Setting Up Kimi K2

you go to this particular repo over here on GitHub and click on the English README, which is over here, you can set your Anthropic base URL and your Anthropic API key to use the Moonshot API instead. So it will use a Kimi model instead, and you can continue to use Claude Code as normal.

So I went to the Moonshot platform page. You can go to platform.moonshot.ai, go to the console, make your account by continuing with Google, then go to API keys, make an API key over here and copy it over. After doing that, we want to go back to this export over here, press copy, and then go to our terminal. I like using Warp terminal, so I'm going to bring this over here, and I have my app running. I'm going to replace the Moonshot API key placeholder with the Kimi key. So I'm going to make a new key and just call it "YouTube". This key will be expired by the time you're watching this video, so don't try to use it.

Then I can press enter over here. Now, if I type claude, it will ask me: you have a custom API key in your environment, do you want to use it? I'm going to press up and then press yes. And now it's using the Kimi key. You can see it says it has overrides over here, so it's overriding the base URL and it's overriding the API key.

Now we have to top up our account, because I don't actually have any money on this account. So I'm going to top it up with $10. It seems that the pricing is actually quite cheap for this model. I think it's actually cheaper than Claude 4 Sonnet. I don't know why, but the Chinese models just seem to be really cheap. Maybe the government is subsidizing the models, or they are just really efficient, or they have special Huawei GPUs or something like that. But basically it's a trend where these Chinese models just seem to be cheaper than the Western models.

Now I just topped up my account, and it seems it gave me an extra $5 for topping up with $10. And now this API key should be working. So I can just type "who are you" to make sure the model works properly. Now it says generating, thinking, whatever. And it seems to say invalid authentication. So it's going to try that over again. So after doing some
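The environment-variable setup described in this segment can also be scoped to a single launch, so that ordinary Claude Code sessions in the same terminal are not redirected. The following is only a minimal sketch under assumptions: the function name `kimi_claude` and the variable `MOONSHOT_API_KEY` are hypothetical names chosen here, while `ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`, and the endpoint URL come from the video description.

```shell
#!/usr/bin/env bash
# Sketch: run Claude Code against Moonshot's Anthropic-compatible endpoint
# for one invocation only. kimi_claude and MOONSHOT_API_KEY are hypothetical
# names used for illustration; they are not part of any tool.

kimi_claude() {
  # Refuse to launch without a key rather than sending an empty token.
  if [ -z "${MOONSHOT_API_KEY:-}" ]; then
    echo "MOONSHOT_API_KEY is not set" >&2
    return 1
  fi
  # Assignments written as a prefix are passed to this one command.
  ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic" \
  ANTHROPIC_AUTH_TOKEN="$MOONSHOT_API_KEY" \
    claude "$@"
}
```

With the key stored once (for example `export MOONSHOT_API_KEY=...` in a private shell profile), running `kimi_claude` would start the Kimi-backed session, and plain `claude` would keep using the default configuration.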

Fixing the API Endpoint

investigation, it seems the reason was that the API over here should not be the CN API. It should be the AI API. So it should be api.moonshot.ai over here, because I found it on this page. It seems that this article over here is meant more for the Chinese audience, and the API keys that work for the Chinese market don't work outside the Chinese market. Anyways, we can press enter over here and then try one more

Testing to Make Sure It Works

time and then say "who are you". So it says it's Claude 4 Sonnet, even though the base URL has been overridden to the Moonshot API. I think it's actually because the prompt that Claude Code uses is slightly different. So maybe we can check if the requests are actually going here. We can refresh the page. If I go to my balance over here, you can see it's slowly falling, which I think means this is actually being used. Anyways, we're going to continue with the command.
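Since the model's own answer to "who are you" is not a reliable signal, a quick check of the configured base URL can catch the endpoint mix-up from the previous segment before any request is sent. This is a rough sketch only: `check_base_url` is a hypothetical helper, and the rule that a host ending in `.cn` is the China-region endpoint is an assumption based on the "CN API" remark in the video.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the base URL Claude Code will be pointed at.
# check_base_url is a hypothetical helper written for this example.

check_base_url() {
  local url host
  # Use the first argument if given, otherwise the current environment value.
  url="${1:-${ANTHROPIC_BASE_URL:-}}"
  host="${url#*://}"   # drop the scheme, if any
  host="${host%%/*}"   # drop the path, leaving only the host
  case "$host" in
    "")              echo "unset"; return 1 ;;
    api.moonshot.ai) echo "ok: $host" ;;
    *.cn)            echo "warning: $host looks like a China-region endpoint"; return 1 ;;
    *)               echo "unrecognized: $host" ;;
  esac
}
```

Calling `check_base_url` with no argument inspects `ANTHROPIC_BASE_URL` from the environment. It says nothing about whether the key itself is valid, so watching the balance or the request details on the platform page remains the real confirmation.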

Coding...

The prompt that I'm going to use is exactly the same as in this last video from 2 weeks ago, which is about Gemini CLI and Claude Code. Basically this prompt says: can you, number one, add a system in the Expo application that allows me to swipe left and right when selecting an article, to swipe to an older article and a newer article? The swiping should be in the order that the articles appear on the homepage. And can you, number two, replace the ElevenLabs model in the speech generation for the daily digest with this model instead? Use the voice ID Casual Guy.

This application is an application for staying up to date with the latest AI news called Tensor AI, and you can download it using the link in the description down below. It has more articles on the real version, because this is a test version that I have running locally. Anyways, we can press enter over here and see what it comes up with.

Whilst waiting, I actually found this page on the Moonshot AI platform website, and you can see that all the requests are actually going here. When I go to billing details, request details over here, I can see there are input tokens, output tokens, cached tokens as

Checking the Result

well. And it seems that it finished making all the changes over here, and it took about 913 seconds to do, which is about 15 minutes. Last time, when I used Claude 4 Sonnet on Claude Code, it took 13 minutes. So it took 2 minutes longer, and we'll see if those 2 minutes actually lead to a better result.

I'm going to close the application over here and reopen it to give it a fresh start. So I restarted the development server and opened the application up again. And swiping left, swiping right: the swiping actually does not work. It did add a "swipe left for the next story". Swipe right. And now the scrolling down doesn't work either. So it seems the model didn't actually do a good job here. Maybe if I click on this, swipe left, swipe right. It added the arrows, but it didn't make the swiping work for some reason. So maybe if I say, "So the swiping doesn't actually work. It stays on the same page over here." If I put that in, we'll see what changes it makes.

But anyway, let's check if it made the MiniMax changes properly. So it did replace it with Replicate instead. It put in our Replicate token. It actually didn't get the version number correct, but it said to replace it with the actual version number; this is a placeholder right now. And it did manage to add an emotion section over here, which I'm quite surprised by. So we can go to the file, fix the placeholder that it added for now, and see if this actually works. One small issue over here is that it didn't use the key, so it didn't check our environment variables properly. We can just replace the token part with key over here, and then if I invoke the function, we can see if it actually generates the audio summary.

So actually, after extending the lookback window over here, it seems to have generated some audio, and we'll just see if this is the one we want: "...to know possible industry..." And yeah, the rest of the function continues as normal, and most importantly it used Replicate to generate the audio this time and then uploaded it to our R2 bucket over here. So yeah, it's actually quite good over here. Basically, all we had to do was change the placeholder and correct the API token to API key.

Now we'll see how it's performed on the swipe issue. It seems that it says it's finished, so we will just close the application and restart it again. And the swiping is still not working. So, yeah, it seems

Overall Thoughts

that it struggles with maybe UI-related tasks. It seems it's better at simpler tasks where it's just swapping out one thing for another, rather than adding full-scale new features. And yeah, basically Claude 4 Sonnet did a much better job at actually implementing the swiping left and right, and it didn't need as much guidance, and it knew to add the gesture view handler as well, that I had to tell it to explicitly add. So yeah, I think that Kimi K2... maybe Kimi K3 will be even better. If you do want to use it, then you can use it for much simpler tasks like swapping out one thing for another. But I think it's not there yet for production codebases.

If you're interested in how many tokens it took so far: if we go to overview over here, it took about 40 cents from my balance, I mean 39 cents. And if I close Claude Code and use a token counter, yeah, today it basically says that it used 250,000 input tokens. I'm sure it used a lot of cached tokens, but it shows zero, maybe because it's not set up or configured properly. It says that if I had used the Claude Code API instead, it would have been $1.24, because the pricing over here is actually configured for the Claude Code API. But Kimi K2 used 40 cents instead. So it's about three times cheaper for this particular task, but it did not actually complete the task, and it also took much longer as well.
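The time and cost comparison quoted in the video can be checked with a few lines of arithmetic. The sketch below takes the figures as stated (913 seconds for Kimi K2, 13 minutes for Claude 4 Sonnet, about $0.40 spent on Moonshot) and reads the Claude estimate as $1.24, which is an assumption that matches the "about three times cheaper" remark. The variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: re-derive the time and cost comparison from the quoted figures.

kimi_seconds=913                 # Kimi K2 run time reported by Claude Code
claude_seconds=$((13 * 60))      # Claude 4 Sonnet run time from the earlier video
extra_seconds=$((kimi_seconds - claude_seconds))

# Integer shell arithmetic: whole minutes plus leftover seconds.
printf 'Kimi K2 run: %dm %ds\n' $((kimi_seconds / 60)) $((kimi_seconds % 60))
printf 'Extra time versus Claude: %ds\n' "$extra_seconds"

# The shell has no floating point, so awk computes the cost ratio.
cost_ratio="$(awk -v claude=1.24 -v kimi=0.40 'BEGIN { printf "%.1f", claude / kimi }')"
printf 'Cost ratio: %sx\n' "$cost_ratio"
```

This prints a run time of 15m 13s, 133 extra seconds (a little over 2 minutes), and a cost ratio of 3.1x, which agrees with the rounded numbers given in the video.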

OpenRouter

One thing you may want to consider doing to save on time and costs is to follow this article over here. It basically teaches you how to set up Claude Code with OpenRouter and use Kimi K2 as the default model with OpenRouter, and then you can set up custom routing rules so that it routes to a different model depending on the complexity of the task. It says you can add more models from OpenRouter and other providers in your configuration file and create custom routing rules to use the best model for each task. So maybe you can set up a system where it automatically goes to Claude 4 Sonnet for particular tasks which are much harder, and goes to Kimi K2 for any easier tasks. But then again, for simplicity, you may just want to use Claude 4 Sonnet for
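The routing idea described here, harder tasks to Claude 4 Sonnet and easier ones to Kimi K2, can be illustrated with a tiny shell function. This is not the configuration format from the linked article. `pick_model` is a hypothetical helper, the "hard" label is an arbitrary convention for this example, and the two model slugs are assumed OpenRouter identifiers that should be checked against OpenRouter's model list.

```shell
#!/usr/bin/env bash
# Sketch: choose a model by a caller-supplied difficulty label.
# pick_model and the "hard" label are hypothetical; the slugs are assumed
# OpenRouter identifiers, not values taken from the video or the article.

pick_model() {
  case "$1" in
    hard) echo "anthropic/claude-sonnet-4" ;;  # new features, UI work
    *)    echo "moonshotai/kimi-k2" ;;         # simple swaps and small edits
  esac
}
```

For example, `pick_model hard` selects the Claude slug and any other label falls through to Kimi K2. A real setup would put equivalent rules in the router's configuration file rather than in a shell function.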
