TeleChat3-36B-Thinking:
✨ Native support for the Ascend + MindSpore ecosystem
✨ Inspired by DeepSeek’s architecture design, bringing training stability and efficiency gains
StepFun has been focused on multimodal AI from the very beginning. Their latest release is a new foundation model: STEP3-VL 🔥
https://huggingface.co/collections/stepfun-ai/step3-vl-10b
✨ 10B - Apache 2.0
✨ Leads in the 10B class and competes with models 10–20× larger
✨ Hybrid architecture: a combined autoregressive + diffusion design delivers strong semantic alignment with high-fidelity detail
✨ Strong performance in long, dense, and multilingual text rendering
✨ MIT licensed (VQ tokenizer & ViT weights under Apache 2.0)
✨ Now live on Hugging Face Inference Providers 🤗 (usage sketch below)
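Since the post doesn't name the exact repo id, here is a minimal sketch of querying a text-to-image model through Hugging Face Inference Providers via huggingface_hub; the `model` value is a placeholder to swap for the actual model id from the release:

```python
# Minimal sketch: calling a text-to-image model on Hugging Face
# Inference Providers. The repo id below is a placeholder (assumption);
# substitute the actual model id from the release collection.
from huggingface_hub import InferenceClient

client = InferenceClient()  # picks up HF_TOKEN from the environment

image = client.text_to_image(
    "A storefront sign reading 'OPEN SOURCE' in dense multilingual script",
    model="org/model-id",  # placeholder repo id
)
image.save("sample.png")  # text_to_image returns a PIL.Image
```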
AgentCPM-Explore 🔥 an on-device agent foundation model released by OpenBMB
openbmb/AgentCPM-Explore
✨ 4B - Apache 2.0
✨ Supports 100+ multi-turn environment interactions with search + verification
✨ Full training/inference stack is openly shared as well (see the loading sketch below)
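A minimal sketch of loading the checkpoint with transformers, assuming a standard causal-LM interface and chat template; the actual agent tooling and prompt format should be taken from the model card:

```python
# Minimal sketch: loading openbmb/AgentCPM-Explore with transformers.
# Assumes a standard causal-LM interface and chat template (assumption);
# consult the model card for the agent-specific environment loop.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "openbmb/AgentCPM-Explore"
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Find the latest Apache-2.0 4B agent models."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```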
✨ Big wave of foundation models: still scaling, but efficiency, reasoning, and deployment now matter more than size
- DeepSeek-V3.2
- Z.ai GLM-4.7
- MiniMax-M2.1
- Xiaomi: MiMo-V2-Flash
✨ Multimodal reasoning is now the default
- Z.ai GLM-4.6V
- Z.ai AutoGLM-Phone 9B
- Bytedance: Dolphin-v2
Only a year into open source, MiniMax is already making a great impact, not only through solid models and products but also through how well the team uses community platforms like Hugging Face: HF Teams, blogs, Daily Papers, Spaces as project pages, and constant experimentation with new ways to engage. Super impressive!