Want to make money and save time with AI? Get AI Coaching, Support & Courses 👉 https://juliangoldieai.com/07L1kg
Get a FREE AI Course + 1000 NEW AI Agents 👉 https://juliangoldieai.com/5iUeBR
Want to know how I make videos like these? Join the AI Profit Boardroom → https://juliangoldieai.com/07L1kg
Z.ai just dropped something wild: a vision model that reads entire books in one go, and it can actually do things with what it sees. This is GLM-4.6V, and it's about to change everything.

Hey, if we haven't met already, I'm the digital avatar of Julian Goldie, CEO of SEO agency Goldie Agency. While he's helping clients get more leads and customers, I'm here to help you get the latest AI updates. Julian Goldie reads every comment, so make sure you comment below.

Right. Z.ai just released GLM-4.6V, and this thing is absolutely insane. I'm talking about a vision language model that can hold 128,000 tokens in context. That's not a typo: 128,000 tokens. You can throw entire books at this thing. Multi-page documents with images, slide decks, legal contracts, research papers, and it keeps everything in memory.

But here's where it gets really crazy. This model doesn't just read and understand images. It can actually call functions, native function calling. What does that mean? It means the model can see an image, understand it, and then perform actions based on what it sees. Parse a chart, extract a table, save it as a CSV, call a database, trigger an automation, all natively. No clunky workarounds. The model does it all.

Let me break down what we're dealing with here. Z.ai released two versions. The first one is GLM-4.6V. This is the flagship: 106 billion parameters, massive, built for cloud and cluster use, with strong multimodal understanding. Then there's GLM-4.6V Flash. Only 9 billion parameters, incredibly fast, optimized for local inference. You can run this on your desktop, on your laptop, even on edge devices. No need to send everything to the cloud. This is huge for privacy, huge for speed.

Now, let me explain why that 128K context window matters. Most vision models choke on long documents. You feed them a few pages and they start forgetting things. But GLM-4.
6V can take a 50-page manual with images and diagrams; you can ask it targeted questions about page 47, and it remembers everything from page one. Legal document analysis, research papers with charts and graphs, multi-page invoices, financial statements, all of it.

When you have 128,000 tokens available, you're not just getting better memory, you're unlocking completely new workflows. You can analyze entire slide decks. You can compare multiple documents side by side. You can feed it a whole book and ask for themes across chapters. This isn't just an incremental improvement. This is a fundamental shift in what's possible.

Now, let's talk about function calling. This is where things get really interesting. Traditional vision models can describe what they see, but they can't do anything with that information. GLM-4.6V changes that. It can see a chart, understand the data, and then call a function to extract that data into a structured format. It can see a receipt, parse the line items, and call a function to save everything to a database. It can see a form, extract the fields, and trigger an automation workflow.

Now, here's where I want to pause for a second, because I need to tell you about something that's going to save you massive amounts of time. If you're watching this video, you're clearly interested in AI tools and automation, and I want to invite you to check out AI Profit Boardroom. This is where we dive deep into tools exactly like GLM-4.6V. We show you how to actually use these models to automate your business workflows, how to save hours every single day, how to build systems that run themselves. No fluff, no theory, just practical implementation. If you want to learn how to use vision models like this one to process documents, extract data, and automate repetitive tasks in your business, check out AI Profit Boardroom, link in the description.

Now, let's get back to GLM-4.6V. So, why does having two versions matter? The full GLM-4.
6V model is 106 billion parameters. It needs serious compute power. You're running this in the cloud or on a cluster of GPUs. This is the version you use when you need maximum capability, when accuracy matters more than speed.

But GLM-4.6V Flash is only 9 billion parameters, small enough to run locally on your own hardware. No cloud required, no API calls, no sending your data to external servers. Everything stays on your machine. This is massive for privacy-sensitive applications: medical data, legal documents, financial information. And Flash is optimized for speed: low latency, quick responses, perfect for interactive applications.

The fact that both models are available on Hugging Face with full weights, that's huge. You can download them, experiment with them, fine-tune them for your specific use case, build them into your products. And for the Flash model, this means developers can build truly local AI applications: offline assistants, edge computing solutions, mobile apps with on-device AI.

Now, let's talk about actually using this thing. You can try GLM-4.6V right now at chat.z.ai. It's live. You upload a multi-page PDF. Let's say it's a 10-page document with images, charts, and dense text. You ask the model to summarize it, extract action items, identify key entities, generate a CSV of all the
people mentioned, and it does all of it, across all 10 pages, with full context maintained throughout.

For the function calling part, the API really shines. You can set up workflows where the model sees an image and automatically triggers actions. You show it a chart; it calls a function to extract the data. You show it a receipt; it calls a function to parse the totals. You show it a screenshot of a dashboard; it calls functions to extract metrics. This is vision to action in real time.

Let me give you a real-world example. Let's say you run an e-commerce company. You get hundreds of invoices every month from suppliers, in different formats: PDFs, screenshots, scanned documents. Normally someone has to manually enter all that data, hours of work every week. With GLM-4.6V, you automate the entire process. Upload the invoices. The model extracts the supplier name, date, line items, and totals, calls functions to validate the data, updates your database, and flags any discrepancies for human review. What used to take hours now takes seconds.

The pricing model is interesting, too. GLM-4.6V on the API costs around 60 cents per million input tokens and 90 cents per million output tokens. That's competitive, especially when you consider the 128K context window. And GLM-4.6V Flash, according to the announcement, is listed as free. Having a capable 9-billion-parameter model available without cost is incredible value.

Here's what I think this means for the industry. We're moving beyond AI models that just analyze and describe. We're moving into AI models that perceive and act: vision models that can trigger real-world consequences based on what they see. And the fact that Z.ai is making this available with open weights, with a local-first option, that's setting a new standard.

So what are the key takeaways? First, GLM-4.6V brings 128K context to vision models, which unlocks document understanding at massive scale.
Second, native function calling means these models can now act on what they see, not just describe it. Third, the Flash variant makes all of this possible locally, which is huge for privacy and speed. Fourth, open weights on Hugging Face mean developers can build with this immediately.

If you're in document processing, automation, or building AI applications, you need to pay attention to this release. This is a capability jump. Go to chat.z.ai and test it. Download the weights from Hugging Face if you want to experiment locally. Read through the docs and see the API examples.

And here's something else that's wild: the timing of this release matters. We're seeing a massive shift in the AI landscape right now. More open models, more local options, more tools that give developers real control. GLM-4.6V isn't just another model announcement. It's part of a bigger movement toward accessible, powerful AI that anyone can use. You don't need a massive budget or a team of ML engineers. You can download these weights today and start building. That's the revolution happening right now, and Z.ai is leading the charge with models like this.

And speaking of implementation, this is exactly the kind of tool we break down in AI Profit Boardroom. We don't just tell you about new AI releases. We show you how to actually use them in your business, how to integrate them into your existing systems, how to build automation that saves you real time. If you want hands-on guidance for using models like GLM-4.6V to transform your operations, check out AI Profit Boardroom. The link is in the description below.

The future of AI is multimodal. It's high-context. It's action-oriented. And with GLM-4.6V, that future is here today.
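If you want to see what the vision-to-action flow described in the video looks like in practice, here is a minimal sketch of an OpenAI-style chat request that pairs an image with a tool definition. The `glm-4.6v` model name, the `save_table_as_csv` tool, and the exact request fields are assumptions for illustration; check Z.ai's API documentation for the real endpoint and schema.

```python
import json

# Hypothetical tool the model could call after reading a chart image.
save_csv_tool = {
    "type": "function",
    "function": {
        "name": "save_table_as_csv",
        "description": "Save a table extracted from an image as a CSV file.",
        "parameters": {
            "type": "object",
            "properties": {
                "filename": {"type": "string"},
                "rows": {
                    "type": "array",
                    "items": {"type": "array", "items": {"type": "string"}},
                },
            },
            "required": ["filename", "rows"],
        },
    },
}

# OpenAI-style request body: one user turn mixing an image and a text prompt,
# plus the tool the model is allowed to call. Model name is an assumption.
request_body = {
    "model": "glm-4.6v",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
                {"type": "text",
                 "text": "Extract this chart's data and save it as a CSV."},
            ],
        }
    ],
    "tools": [save_csv_tool],
}

print(json.dumps(request_body, indent=2))
```

In a real integration you would POST this body to the provider's chat-completions endpoint and then execute whatever tool call comes back in the response.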
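As a back-of-the-envelope illustration of what the 128K context window mentioned above can actually hold: the per-page and per-image token counts below are rough assumptions, not tokenizer measurements, so treat this as a sizing heuristic only.

```python
# Assumed context window from the announcement.
CONTEXT_WINDOW = 128_000

def fits_in_context(pages: int, images: int,
                    tokens_per_page: int = 600,      # assumed dense-text page
                    tokens_per_image: int = 1_000,   # assumed per-image cost
                    reply_budget: int = 4_000) -> bool:
    """Return True if the document plus a reply budget fits in the window."""
    needed = pages * tokens_per_page + images * tokens_per_image + reply_budget
    return needed <= CONTEXT_WINDOW

# A 50-page manual with 30 diagrams fits comfortably under these assumptions;
# a 300-page book with 50 figures would need chunking.
print(fits_in_context(50, 30))
print(fits_in_context(300, 50))
```

The point is that a single request can cover the 50-page-manual scenario from the video, while truly book-length inputs may still need to be split.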
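To put the quoted pricing (60 cents per million input tokens, 90 cents per million output tokens) in perspective for the invoice workflow, here is a rough monthly cost estimate. The per-invoice token counts are illustrative assumptions, not measured figures.

```python
# Rates quoted in the announcement, in USD per million tokens.
INPUT_PER_M = 0.60
OUTPUT_PER_M = 0.90

def monthly_cost(invoices: int, in_tokens_each: int, out_tokens_each: int) -> float:
    """Estimated monthly API spend in USD (token counts per invoice are assumed)."""
    total_in = invoices * in_tokens_each
    total_out = invoices * out_tokens_each
    return total_in / 1e6 * INPUT_PER_M + total_out / 1e6 * OUTPUT_PER_M

# e.g. 500 invoices/month at ~2,000 input and ~300 output tokens each
print(f"${monthly_cost(500, 2_000, 300):.2f}")
```

Even at hundreds of invoices a month, the API bill under these assumptions stays under a dollar, which is the kind of math that makes automating manual data entry a no-brainer.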