mitkox 
posted an update 3 days ago
I just stress-tested the Beast: MiniMax-M2.1 on Z8 Fury G5.
2101 tokens/sec. FORTY concurrent clients. That's 609 t/s out, 1492 t/s in. The model outputs fire faster than I can type, but feeds on data like a black hole on cheat day.
But wait, there's more! Threw it into Claude Code torture testing with 60+ tools, 8 agents (7 sub-agents because apparently one wasn't enough chaos). It didn't even flinch. Extremely fast, scary good at coding. The kind of performance that makes you wonder if the model's been secretly reading Stack Overflow in its spare time lol
3 months ago, these numbers lived in my "maybe in 2030" dreams. Today it's running on my desk AND heats my home office during the winter!

🔥 Got any pics of this rig? Would love to see how it's managing thermals.

Honestly,
looks very cool and stable,
almost like it is boring?

Google says 'Mitko Vasilev, a CTO who runs the model on this configuration, reported impressive stress test results: the setup achieved 2101 tokens/second across forty concurrent clients. The user emphasizes the capability of the Z8 Fury G5 to handle "enterprise-grade AI throughput" on a local desktop, positioning it as a powerful, cost-effective alternative to cloud GPU service'. Seems Google needs to update the results from here.