Kimi Moonshot. So, he's at the company building these Kimi AI models. He posted this, and it was later deleted. I found Harve, who reposted it, so that's why we're able to see it. So, this is no longer on X, as far as I can tell, but this is a post by Yulan. Again, he works for Kimi. He's saying, "Wait, we tested with Composer 2 model API and found out that tokenizer is indeed the same with our Kimi tokenizer." Right? So somebody from the Kimi team is going, "Yeah, this is indeed our model," or at least the base model is. He's saying, "We can almost confirm this is our model post-trained further. We are shocked that Cursor AI did not respect our license, nor did they pay us any fees." And he's saying, "Michael Truell, why did you do this?"

So then that post disappears, as far as we can tell, and later this Kimi.ai post appears. They're saying, "Congrats to the Cursor team on the launch of Composer 2. We are proud to see Kimi K2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pre-training and high-compute RL training is the open model ecosystem we love to support." So it looks like Fireworks AI is the inference provider: Cursor accesses Kimi K2.5 via Fireworks AI's hosted reinforcement learning and inference platform as part of an authorized commercial partnership.

All right, so what just happened? Let's review. Cursor launches Composer 2, which I believe came out March 18th. They describe it as frontier-level intelligence at a low cost and basically present it as their own model, without mentioning China or Moonshot or Kimi or anything. So it really sounded, I think to most of us, like they built it from scratch, because there was no mention of any open-source model. Later, an X user finds that they're still referring to that model in some URL as Kimi K2.5.
And within hours it's everywhere on Reddit, X, and Hacker News. Employees working at Kimi are going, hey, what's going on? Why aren't you crediting us with using our models? Employees over at Cursor are posting, okay, yes, we've used this as the base model, but they're still not using the name until they get called out for it. So here in this original one, no mention of Kimi, right? Some people are calling them out, saying, why is everything coming out only after the leak, and you're still not giving credit to the open-source model? And then later, as a response, they're saying, here's the confirmation that we're using Kimi: since people really want me to say this, Kimi K2.5. Yes, that is the base we started from, and we are following the license through inference partner terms.

And when people keep pressing them for more answers, asking why they aren't disclosing what the model is, they post this. This is from Cursor. It's called Training Composer for longer horizons. It's technically a very interesting blog post, and it reveals how they've approached this whole thing and what they built on top of Kimi. Here they describe this concept of self-summarization. Basically, the model pauses mid-task, summarizes everything it knows so far into about a thousand tokens, and then continues on with that compressed context. As they say here, by making self-summarization part of Composer's training, we can get training signal from trajectories much longer than the model's max context window.

Now, we'll come back to this blog post in just a second, because it has some very interesting points, including, you'll never guess what appears halfway through: Doom. But really fast, in case you're wondering why they didn't actually mention Kimi: was this some attempt to use something without disclosing it, or without following the license or paying some fees? Was something stolen? The answer is no.
As far as we can tell, everything was above board. Everything was fine. A lot of the open-source community wanted them to attribute it to the right model, to say Kimi, this was based on Kimi. And there are great arguments for why that should happen. It gives credit to the original open-source community. It also signals to a lot of other people building in that community that, hey, we're all behind this open-source community, and if you build something amazing, people will give you attribution. So if it seems like people are taking advantage without putting anything back in, without attributing, it does ruffle a lot of people's feathers. And members of Cursor did state that, yes, they should have, and that in the future they know they need to make sure they put in the attribution. With that said, let's assume we gave them a stern finger wagging and they've learned their lesson. Let's assume
found that it was able to solve this problem correctly. The solution required engineering and testing a significant amount of code as well as exploring some alternative implementations. Here's an image rendered in the course of solving the problem. By the way, this is a bit embarrassing, but this image, if you're looking at it and you remember how the original looks, you're probably like, "Oh, this is kind of warped and weird." It's not. Here's the thing. I have a green screen behind me, and I'm using software that filters out that particular green color so I can add various effects, such as being over this website that we're looking at here. It's all very fancy. But the downside is that, apparently, that specific green in the image also gets filtered out. Sorry, ADHD moment. The point is, Composer worked for 170 turns to find an exact solution, along the way creating self-summaries in a compact, human-readable, and structured form. So it self-summarized more than 100,000 tokens down to the 1,000 it believed would most help it solve the problem.

All right, so in the end, what does this all mean? Number one, yes, in a perfect world Cursor should have said Kimi K2.5 was the base model. The people that are upset and yelling at Cursor right now do have a point. Point number two, did Cursor do this to pull a fast one, to trick people? I don't think so. I think it was done to avoid the drama from the geopolitically sensitive issues surrounding US versus China. It really doesn't seem like they just took Kimi, rebranded it, put their own label on it, and resold it. It doesn't seem like that's what's happening here. Again, they put in more compute than was used to pre-train the original model. Also, they built on top of an open-source model, and they did put out this blog post on Composer's self-summarization.
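To make the self-summarization idea concrete, here is a toy sketch of the loop as it is described in the video: keep adding turns, and whenever the context goes over a limit, squash everything into a short summary and keep going. This is not Cursor's implementation. The class name, the whitespace word count used as a "token" count, the limits, and the `toy_summarize` placeholder (which just keeps the most recent words instead of asking a model to write a summary) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude token count: whitespace-separated words (an assumption, not a real tokenizer)."""
    return len(text.split())


def toy_summarize(history: List[str], budget: int) -> str:
    """Hypothetical stand-in for a model-written summary: keep only the last `budget` words."""
    words = " ".join(history).split()
    return " ".join(words[-budget:])


@dataclass
class SelfSummarizingContext:
    max_context_tokens: int   # limit that triggers a summary
    summary_budget: int       # size the context is compressed down to
    summarize: Callable[[List[str], int], str] = toy_summarize
    history: List[str] = field(default_factory=list)
    summaries_made: int = 0

    def total_tokens(self) -> int:
        return sum(count_tokens(h) for h in self.history)

    def add_turn(self, text: str) -> None:
        """Append a turn; if the context is now over the limit, replace it with a summary."""
        self.history.append(text)
        if self.total_tokens() > self.max_context_tokens:
            summary = self.summarize(self.history, self.summary_budget)
            self.history = [summary]
            self.summaries_made += 1


def run_demo(turns: int = 170, words_per_turn: int = 50) -> SelfSummarizingContext:
    """Simulate a long task whose raw transcript far exceeds the context limit."""
    ctx = SelfSummarizingContext(max_context_tokens=1000, summary_budget=100)
    for i in range(turns):
        ctx.add_turn(" ".join(f"t{i}w{j}" for j in range(words_per_turn)))
    return ctx


if __name__ == "__main__":
    ctx = run_demo()
    print(ctx.summaries_made, ctx.total_tokens())
```

In this toy run, 170 turns produce 8,500 words in total, but the working context never ends a turn above 1,000, because each overflow is replaced by a 100-word summary. The part the sketch cannot show is the interesting one: per the blog post, the summary is written by the model itself and summarization is part of training.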
So, they're publishing their research, right? They're part of the open-source ecosystem. They're doing a lot of things right. They're adding to it. They're contributing to it. And they're showing some pretty powerful things you can do with these open-source models and your own data, and how you can use your own reinforcement learning training to take something that is good and turn it into something pretty amazing, pretty great.

So here's Clément Delangue. He is the founder currently running Hugging Face. This is probably one of the top open-source AI champions, one of the people getting it out there, enabling open-source AI to be used, and doing a lot for the ecosystem as a whole. I mean, the dude runs Hugging Face, right? So, he's saying open source keeps being the greatest competition enabler. Another validation for Chinese open source, which is now the biggest force shaping the global AI stack. And the frontier is no longer just about who trains from scratch, but who adapts, fine-tunes, and productizes fastest. Seeing the same thing with OpenClaw, for example.

So I think, all in all, it's a big win for open source. It does raise the question of how many other models are out there where companies are saying, "Yeah, we trained our own in-house model," but they were just able to cover their tracks and it's really from some Chinese open-source model. Is it possible that Cursor isn't the first company that thought of doing this, and they were just the first that got caught? I don't know. But all in all, I think all's well that ends okay. And I hope Cursor does pre-train their own model next, if that makes sense from a technical perspective. More importantly, I hope they continue doing research like this, improving the models and posting what they've done. But let me know if you agree or disagree. Do you think they should get continued outrage over this? Again, I'm not saying that this was a great thing to do.
In this video I just wanted to point out that they've also contributed a lot. And I think the whole China situation is what caused the sensitivity in the first place. I don't think this was just a low-effort way to steal something. I don't think that's the right narrative here. But let me know what you think in the comments.

If you made it this far, thank you so much for watching. I will see you in the next one. My name is Wes Roth. Please consider subscribing. My channel has, like, the highest views-to-subs ratio, I think, in this entire industry on YouTube for AI. I mean, why can't we make this official? Are you embarrassed of being seen with me? Am I just, like, on the side? Hit subscribe if you're not subscribed. Hit subscribe. Let's make this official and I'll see you in the next