NEW Chinese AI Agent is INSANE!
8:25

Julian Goldie SEO, 21.01.2026, 5,424 views, 165 likes, updated 18.02.2026
Video description
Want to make money and save time with AI? Get AI Coaching, Support & Courses 👉 https://www.skool.com/ai-profit-lab-7462/about
Get a FREE AI Course + 1000 NEW AI Agents + Video Notes 👉 https://www.skool.com/ai-seo-with-julian-goldie-1553/about
Want to know how I make videos like these? Join the AI Profit Boardroom → https://www.skool.com/ai-profit-lab-7462/about
Get a FREE AI SEO Strategy Session: https://go.juliangoldie.com/strategy-session?utm=julian
Sponsorship inquiries: https://docs.google.com/document/d/1EgcoLtqJFF9s9MfJ2OtWzUe0UyKu1WeIryMiA_cs7AU/edit?tab=t.0

This Tiny 10B Chinese AI is Beating Models 20x Its Size! Step 3 VL10B is a revolutionary open-source multimodal model that punches far above its weight class by outperforming giants like Gemini Pro. We explore the 'PiCoRa' technology and unified pre-training that allows this 10B model to deliver elite performance on your own hardware.

00:00 - Intro
00:28 - Meet Step 3 VL10B
01:28 - The 3 Secret Training Weapons
02:13 - How Parallel Reasoning (PiCoRa) Works
02:43 - Benchmark Results vs. Gemini Pro
04:08 - The Shift to Efficient Open Source AI
04:36 - OCR, GUI, and STEM Use Cases
07:38 - How to Get Started and Download

Contents (8 segments)

  1. 0:00 Intro (73 words)
  2. 0:28 Meet Step 3 VL10B (204 words)
  3. 1:28 The 3 Secret Training Weapons (135 words)
  4. 2:13 How Parallel Reasoning (PiCoRa) Works (97 words)
  5. 2:43 Benchmark Results vs. Gemini Pro (233 words)
  6. 4:08 The Shift to Efficient Open Source AI (75 words)
  7. 4:36 OCR, GUI, and STEM Use Cases (582 words)
  8. 7:38 How to Get Started and Download (148 words)
0:00

Intro

New Chinese AI agent is insane. It just dropped and it's destroying models 20 times its size. I'm talking about a tiny 10 billion parameter model beating Google's Gemini, a massive 200 billion parameter giant. This thing runs on your laptop but thinks like a supercomputer. And the crazy part is it's completely free and open source. Today I'm showing you exactly how this works and why everyone's losing their minds. Okay, so
0:28

Meet Step 3 VL10B

there's this new AI model called Step 3 VL10B that just came out. And when I saw the benchmarks, I thought someone made a mistake. This thing is only 10 billion parameters. That's tiny in the AI world, but it's beating models that are 10 times bigger, 20 times bigger, even some of the best models out there. We're talking about a David versus Goliath situation, except David is winning every single fight. So what is this thing? Step 3 VL10B is a multimodal AI model. That means it can see images and read text at the same time. It can do OCR. It can understand screenshots. It can solve math problems. It can do spatial reasoning. Basically, everything the big models do. But here's the kicker. It's open source. You can download it right now. You can run it on your own hardware. No rate limits, nothing. Hey, if we haven't met already, I'm the digital avatar of Julian Goldie, CEO of SEO agency Goldie Agency. Whilst he's helping clients get more leads and customers, I'm here to help you get the latest AI updates. Julian Goldie reads every comment, so make sure you comment below. Now, let me break down why this
1:28

The 3 Secret Training Weapons

model is actually insane. The team behind this used three secret weapons. The first thing is unified pre-training. They trained this model on 1.2 trillion tokens of images and text mixed together from the very start. Most models learn vision and language separately, then try to connect them later. This one learned them together. That's why it understands visual stuff so well. All right. Second thing is reinforcement learning at massive scale. They didn't just train it once and call it done. They ran over 1,000 reinforcement learning iterations. They used supervised fine-tuning. They used something called RLVR, which is reinforcement learning with verifiable rewards. They used human feedback. All of this teaches the model how to actually reason and think through problems instead of just pattern matching. But here's the real magic. The
2:13

How Parallel Reasoning (PiCoRa) Works

third secret weapon is called parallel coordinated reasoning, or PiCoRa for short. And this is what makes the model punch way above its weight. When you ask this model a question, it doesn't just think through one answer. It creates 16 different reasoning paths at the same time: 16 parallel minds all thinking about your problem from different angles. Then it takes all those answers and synthesizes them into one final response. It's like having a whole team of experts instead of just one person. And that's why a small model can compete with giants. The benchmarks
2:43

Benchmark Results vs. Gemini Pro

are absolutely wild. On MMBench, which tests general multimodal understanding, this thing scores 92.2%. That's better than models with over 100 billion parameters. On AIME 2025, which is super hard math reasoning, it scores 94.43%. That's outranking models that are literally 20 times bigger. On MMMU, which tests knowledge across multiple subjects, it hits 80.11%. These aren't just good scores. These are elite-tier scores that you usually only see from the biggest models. So, let me put this in perspective. There's a model called GLM-4.6V that has 106 billion parameters. Step 3 VL matches or beats it on several benchmarks. There's Qwen3-VL with 235 billion parameters. Step 3 VL is competitive with it despite being about 20 times smaller. Step 3 VL is even approaching the performance levels of Google's Gemini 2.5 Pro, with 10 billion parameters, running on hardware you can actually afford. And if you want to learn how to actually use AI tools like Step 3 VL to scale your business and automate everything, you need to check out the AI Profit Boardroom. This is where I show you exactly how to save hundreds of hours with AI automation and get more customers without working harder. You'll learn how to implement cutting-edge AI models like this one into your actual business processes. Not just theory: real, actionable automation. Link is in the description. So why does this
4:08

The Shift to Efficient Open Source AI

matter for you? Three big reasons. First, it democratizes AI. You can run frontier-level AI on your own machine. Second, it's way more efficient. Smaller models mean faster responses and less energy usage. You can actually build products and services with this without burning through cash. Third, it's open source, which means you can customize it, fine-tune it for your specific use case, build it into your apps, do whatever you want with it. Now
4:36

OCR, GUI, and STEM Use Cases

let me show you what this thing can actually do. It's incredible at OCR. You can feed it a screenshot of a document and it'll extract all the text perfectly. It understands GUI interfaces. You can show it a screenshot of software and ask it to explain what each button does. It's amazing at spatial reasoning. Show it a diagram or a map and it can understand relationships between objects. It crushes STEM problems: math, physics, chemistry. It can solve complex equations and explain the reasoning step by step.

The model comes in two versions. There's Step 3 VL10B Base, which is the core foundation model, and there's Step 3 VL10B Chat, which is instruction-tuned for conversations. The chat version is what most people will want to use. You can download both of them from Hugging Face or ModelScope right now. They're completely free. The technical report just came out on January 14th, 2026. You can read all the details there.

Here's what's really fascinating about the parallel reasoning approach. Traditional AI models think sequentially. They go step one, step two, step three. If they make a mistake early on, that mistake cascades through the whole answer. But with PiCoRa, you have 16 different reasoning chains running at once. So if one chain makes a mistake, the other 15 can catch it. The model looks at all the different approaches and picks the best parts from each one. It's like crowdsourcing intelligence inside a single model. This is why smaller models with better reasoning can beat bigger models with brute-force approaches. And the Step 3 team proved that with the right training techniques and the right inference strategies, you can get world-class performance without needing hundreds of billions of parameters.

Think about what this means for the future. If a 10 billion parameter model can compete with models 20 times its size, what happens when this technology gets applied to bigger models? What happens when someone builds a 100 billion parameter model with PiCoRa?
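The error-correction argument above (16 chains, with the others catching one chain's mistake) can be illustrated with a small calculation. This is only a minimal sketch under loud assumptions, not the model's actual method: the function names `synthesize` and `majority_vote_accuracy` are hypothetical, plain majority voting stands in for the model's synthesis step, and each chain is treated as independently correct with the same probability `p`. Real chains sampled from one model are correlated, so the numbers are optimistic.

```python
from collections import Counter
from math import comb


def synthesize(answers: list[str]) -> str:
    """Stand-in for the synthesis step: return the most common final answer.

    Assumption: plain majority voting. The video describes a richer step
    that combines the best parts of each chain.
    """
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


def majority_vote_accuracy(p: float, n: int = 16) -> float:
    """Probability that a strict majority of n chains is correct.

    Assumptions: every chain is right with probability p, chains are
    independent, and a tie counts as a failure (pessimistic).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    needed = n // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1)
    )


if __name__ == "__main__":
    # Three toy chains: two agree, one made an arithmetic slip.
    print(synthesize(["42", "42", "41"]))
    # Compare a single chain with 16 voting chains at several accuracies.
    for p in (0.4, 0.6, 0.7, 0.9):
        print(f"single chain: {p:.2f}  16 chains: {majority_vote_accuracy(p):.3f}")
```

Under these assumptions, a per-chain accuracy of 0.7 gives a voted accuracy above 0.9, while a per-chain accuracy of 0.4 gives a voted accuracy below 0.4. Running chains in parallel amplifies a model that is usually right, but it does not rescue one that is usually wrong.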
We're probably looking at AI that's orders of magnitude more capable than anything we have today. And it's all happening faster than anyone expected. The team behind this is from China and they're absolutely crushing it right now. While everyone in the West is focused on making models bigger and bigger, the Chinese teams are figuring out how to make models smarter, more efficient, more accessible, and they're winning. This isn't the first time either. We've seen Chinese labs release incredible open-source models over and over again, and they keep getting better. The use cases are endless. You could build an automated data extraction service that reads invoices and receipts. You could create a GUI testing tool that understands user interfaces. You could make an educational app that solves math problems and explains the steps. You could build a content moderation system that understands images and text together. All of this with a model you can run on your own hardware. If you want to stay ahead of the curve, you need to be experimenting with these tools now. Download the model, play around with it, see what it can do, figure out how to integrate it into your workflow. The people who master AI early are going to have a massive advantage over everyone else. And tools like Step 3 VL make it easier than ever to get started. So, go check it out. Download the model, read the technical report, start building. And if
7:38

How to Get Started and Download

you want to learn how to actually use AI tools like Step 3 VL to scale your business and automate everything, you need to check out the AI Profit Boardroom. This is where I show you exactly how to save hundreds of hours with AI automation and get more customers without working harder. You'll learn how to implement cutting-edge AI models like this one into your actual business processes. Not just theory: real, actionable automation. Link is in the description. And if you want the full process, SOPs, and 100 plus AI use cases like this one, join the AI Success Lab. Links are in the comments and description. You'll get all the video notes from there, plus access to our community of 38,000 members who are crushing it with AI. All right, thanks for watching. Hit the like and subscribe button and I will see you in the next
