Parallel-R1: Towards Parallel Thinking via Reinforcement Learning Paper • 2509.07980 • Published Sep 9, 2025 • 103
EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models Paper • 2602.04515 • Published 9 days ago • 38
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration Paper • 2602.01734 • Published 11 days ago • 32
🧠Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19, 2025 • 182
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 296
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 117
The Era of Agentic Organization: Learning to Organize with Language Models Paper • 2510.26658 • Published Oct 30, 2025 • 29
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published Oct 22, 2025 • 61
QueST: Incentivizing LLMs to Generate Difficult Problems Paper • 2510.17715 • Published Oct 20, 2025 • 35