Quick Glance
- What: GLM-4.5 and GLM-4.5-Air—two new open-source “agentic” AI models from Zhipu AI (China)
- Size: 355 B total / 32 B active (GLM-4.5) and 106 B total / 12 B active (GLM-4.5-Air)
- License: MIT—free for commercial use, fine-tuning, and resale
- Best at: Writing code, calling tools, long-context reasoning, building slide decks, and anything “agent-like”
- Where to try: Hugging Face (slow queue) or chat.z.ai (faster, limited free tier)
Outline
- Why this launch matters
- Two models, two speeds
- Training recipe—why it’s built for agents
- Benchmarks—how it stacks up against GPT-4.1, Claude, Gemini
- Real-world test drive
- From a single prompt to a ready-to-run ad video
- From a vague idea to a finished PowerPoint
- How to start using GLM-4.5 today
- Bottom line—should you care?
1. Why this launch matters
Open-source AI usually comes with strings attached: research-only licenses, capped commercial use, or partial weights.
Zhipu AI just flipped that script. By dropping a 355-billion-parameter mixture-of-experts (MoE) model under the permissive MIT license, they’ve handed startups, indie devs, and hobbyists the same firepower that was locked behind closed APIs last month.
2. Two models, two speeds
| Model | Total params | Active params | Sweet spot |
|---|---|---|---|
| GLM-4.5 | 355 B | 32 B | Heavy coding, complex agents |
| GLM-4.5-Air | 106 B | 12 B | Fast chat, lighter hardware needs |
Think of Air as the “iPad” edition—smaller, snappier, and still shockingly capable.
3. Training recipe—why it’s built for agents
Pre-training buffet
- 15 T tokens of general text (books, web pages, Harry Potter—you name it)
- 7 T tokens of code + reasoning data
- Repository-level code ingested whole, so the model "sees" folders, not just snippets
Post-training polish
- Long-context “agentic” tuning—synthetic tasks that mimic browsing, tool-calling, and multi-step planning
- Real-world software feedback—the model’s code suggestions are executed, tested, and graded automatically
The result? GLM-4.5 doesn’t just talk about tools; it uses them.
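To make "uses them" concrete: in most open deployments, tool calling follows the OpenAI-style function-calling request shape, where you declare each tool as a JSON schema and the model decides when to invoke it. A minimal sketch of such a request payload (the `get_weather` tool is a hypothetical example, not something shipped with GLM-4.5):

```python
import json

def build_tool_call_request(user_message: str) -> dict:
    """Build an OpenAI-style chat request exposing one tool to the model.

    The tool ("get_weather") is a made-up illustration; deployments of
    GLM-4.5 that speak the OpenAI-compatible API accept this shape.
    """
    return {
        "model": "glm-4.5",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical example tool
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = build_tool_call_request("What's the weather in Beijing?")
print(json.dumps(payload, indent=2))
```

If the model decides the tool is needed, its reply contains a structured tool call instead of prose; your agent loop executes it and feeds the result back.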
4. Benchmarks—how it stacks up
| SWE-bench Verified | GLM-4.5 | GLM-4.5-Air | GPT-4.1 | Claude 3.5 Sonnet | Gemini 2.5 Pro |
|---|---|---|---|---|---|
| Pass rate | 64.2 % | 57 % | 48 % | 70 % | 49 % |
- Second place only to Claude Sonnet on this coding set
- Beats GPT-4.1 by 16 points—a jump most devs will feel
5. Real-world test drive
5.1 Thirty-second ad in one prompt
We fed both GLM-4.5 and GPT-4o the same prompt:
“Rewrite this IKEA-style ad script for the Nothing Phone.”
- GPT-4o nailed the vibe but hallucinated the phone design.
- GLM-4.5 asked clarifying questions, then produced a JSON storyboard ready for Veo-3.
Both scripts ran in Runway; GLM's JSON needed one fewer edit.
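For flavor, a "JSON storyboard" in this style might look like the sketch below. The field names are purely illustrative assumptions, not the actual schema Veo-3 or Runway expects:

```python
# Hypothetical storyboard shape, loosely like what GLM-4.5 emitted.
# Field names are illustrative only.
storyboard = {
    "title": "Nothing Phone, 30-second spot",
    "shots": [
        {"id": 1, "duration_s": 5,
         "visual": "Matte table, phone face down",
         "voiceover": "Some phones shout."},
        {"id": 2, "duration_s": 5,
         "visual": "Glyph lights pulse once",
         "voiceover": "This one glows."},
    ],
}

def total_runtime(board: dict) -> int:
    """Sum shot durations so an editor can sanity-check the 30-second target."""
    return sum(shot["duration_s"] for shot in board["shots"])
```

The win of a structured storyboard over free-text script: downstream video tools can validate runtime and shot count before you burn render credits.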
5.2 PowerPoint on autopilot
Prompt:
“Create a 10-slide deck for an AI-agents productivity workshop.”
GLM-4.5 scraped live sources, pulled icons, and emitted an HTML slide deck that looked like PowerPoint—no copy-paste marathon.
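An "HTML slide deck" here just means one self-contained file with one `<section>` per slide. A minimal sketch of that pattern (the structure is our illustration of the idea, not GLM-4.5's actual output):

```python
def render_deck(slides: list[tuple[str, str]]) -> str:
    """Render (title, body) pairs as a single-file HTML slide deck,
    roughly the kind of artifact described above (structure is illustrative)."""
    sections = "\n".join(
        f"<section><h1>{title}</h1><p>{body}</p></section>"
        for title, body in slides
    )
    return f"<!DOCTYPE html><html><body>\n{sections}\n</body></html>"

html = render_deck([
    ("AI Agents 101", "What an agent loop actually does"),
    ("Tool calling", "Schemas in, structured calls out"),
])
```

Open the resulting file in a browser and add CSS page-per-section rules to get a clickable deck without touching PowerPoint.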
6. How to start using GLM-4.5 today
Option A: Hugging Face
- Search “zai-org/GLM-4.5”
- Expect a queue; it’s popular
Option B: Z.ai Chat (Recommended)
- Go to chat.z.ai
- Sign in with Google/GitHub
- Free tier gives ~20 messages/day—enough to poke around
Option C: Self-host
If you have:
- 8×A100 80 GB → GLM-4.5-Air
- 16×A100 80 GB → GLM-4.5
Docker images and vLLM configs are already on the repo.
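If you go the self-host route, vLLM serves an OpenAI-compatible API at `/v1` once the model is up (e.g. `vllm serve zai-org/GLM-4.5-Air --tensor-parallel-size 8`). A minimal client sketch, assuming the default port and that model ID; check the repo's configs for the exact launch flags:

```python
import json
import urllib.request

def build_chat_body(prompt: str, model: str = "zai-org/GLM-4.5-Air") -> bytes:
    """Encode one chat request in the OpenAI-compatible format vLLM accepts."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()

def chat_locally(prompt: str, base_url: str = "http://localhost:8000/v1") -> dict:
    """Send the request to a locally running vLLM server (assumed default port)."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=build_chat_body(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Any OpenAI-style SDK also works against that endpoint; point its `base_url` at your server and use the model ID you served.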
7. Bottom line—should you care?
If you build with code, agents, or anything that needs cheap, top-tier reasoning, GLM-4.5 is a no-brainer. It’s free, open, and outperforms models you’re probably paying for. China isn’t slowing down—and neither should you.