Article LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling 3 days ago • 41
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 269
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning Paper • 2601.05593 • Published Jan 9 • 84
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability Paper • 2601.18778 • Published 20 days ago • 40
OptiMind: Teaching LLMs to Think Like Optimization Experts Paper • 2509.22979 • Published Sep 26, 2025 • 4
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published 25 days ago • 72
Clara-Molecular Collection NVIDIA Clara Models for Molecular Science • 10 items • Updated 10 days ago • 7
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 225
Dr. Zero: Self-Evolving Search Agents without Training Data Paper • 2601.07055 • Published Jan 11 • 20
RelayLLM: Efficient Reasoning via Collaborative Decoding Paper • 2601.05167 • Published Jan 8 • 31
Jamba Reasoning 3B Collection AI21's top-performing reasoning model that packs leading scores on intelligence benchmarks and highly-efficient processing into a compact 3B build • 2 items • Updated Oct 8, 2025 • 6
Cerebras REAP Collection Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 26 items • Updated 18 days ago • 109
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits Paper • 2512.20578 • Published Dec 23, 2025 • 86
DFlash Collection Block Diffusion for Flash Speculative Decoding • 5 items • Updated 8 days ago • 17