Nano-BEIR: A Multilingual Information Retrieval Benchmark with Quality-Enhanced Queries 9 days ago • 5
QVAC Genesis II: Expanding the Largest and Highest-Quality Multi-domain Educational Synthetic Dataset for LLM Pre-training 12 days ago
Announcing LiteCoder-Terminal: Lightweight Terminal Agents with <1k Synthesized Trajectories 14 days ago • 9
Introducing AutoBench 2.0: Our New Benchmarking Platform is Out Just in Time to Evaluate GPT 5.2. 14 days ago • 1