Chinese AI startup Zhipu AI has launched its new-generation open-source large language model series, GLM-4.5 and GLM-4.5 Air, designed to unify advanced reasoning, coding, and agentic capabilities. The flagship GLM-4.5 features 355 billion total parameters with 32 billion active parameters in a mixture-of-experts (MoE) architecture, while the lighter GLM-4.5 Air has 106 billion total parameters with 12 billion active. Both models support two operational modes: a "thinking mode" for complex reasoning and tool use, and a "non-thinking mode" for instant responses.

GLM-4.5 has demonstrated competitive performance against leading global AI models such as Claude 4 Opus and Gemini 2.5 Pro, excelling particularly in coding, reasoning, and agentic tasks. The models are offered under an MIT license, with API pricing of $0.6 per million input tokens and $2.2 per million output tokens for GLM-4.5, and lower rates for GLM-4.5 Air. The launch marks a milestone in China's AI development, with GLM-4.5 topping domestic evaluations and matching global top models while costing approximately one-tenth of Claude's API fees. The models were trained on a 22 trillion-token corpus and incorporate features such as grouped-query attention (GQA), partial RoPE, multi-token prediction, the Muon optimizer, and native tool use.

Meanwhile, other Chinese open-source models such as the Qwen3 series continue to advance, with Qwen3 235B 2507 and Qwen3-Coder-30B-A3B gaining traction on reasoning, coding, and agentic benchmarks; some support local deployment with context windows up to 1 million tokens. Additionally, Cerebras Systems has introduced hosted endpoints for Qwen3 models, delivering inference speeds of up to 2,000 tokens per second with subscription plans starting at $50 per month.
This surge in Chinese open-source AI models reflects an ongoing acceleration in frontier AI development, with multiple models achieving top rankings on international leaderboards and competitive performance against established Western AI systems.
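To make the pricing concrete, here is a back-of-the-envelope sketch of workload cost at the GLM-4.5 rates quoted above ($0.6 per million input tokens, $2.2 per million output tokens). The token counts in the usage example are hypothetical, and actual billing may differ (tiers, caching, minimums), so treat this as illustrative arithmetic only:

```python
# Rough API cost estimate at the GLM-4.5 rates quoted in the text.
# These constants mirror the article's numbers; check the provider's
# current price sheet before relying on them.
GLM45_INPUT_USD_PER_M = 0.6   # $ per million input tokens
GLM45_OUTPUT_USD_PER_M = 2.2  # $ per million output tokens

def api_cost_usd(input_tokens: int, output_tokens: int,
                 in_rate: float = GLM45_INPUT_USD_PER_M,
                 out_rate: float = GLM45_OUTPUT_USD_PER_M) -> float:
    """Cost in USD for a workload, given per-million-token rates."""
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

# Hypothetical agentic workload: 50M input tokens, 10M output tokens.
cost = api_cost_usd(50_000_000, 10_000_000)
print(f"${cost:.2f}")  # 50 * 0.6 + 10 * 2.2 = $52.00
```

The same function can be pointed at any provider's per-million-token rates for a side-by-side comparison.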
Qwen3 Coder 480B or Sonnet 4? Who’s used Q3C for real stuff on large repos?
Ever wondered how Groq runs massive models in production so fast? Andrew Ling, Head of ML Compilers at Groq, breaks it down. Link in comments.
WHAT 🤯 A Chinese report says this DeepSeek-finetuned 671B-param scientific model achieved 40.44% on HLE. That would place it well above most current frontier models. Licensed Apache 2.0, available on @huggingface 🧪 S1-Base is a set of open scientific language models built from 170M https://t.co/r0jCxmBxtG