FineVerify: Scaling Test-Time Compute with Fine-Grained Self-Verification for Agentic Search Paper • 2606.00660 • Published 21 days ago • 8
Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies Paper • 2604.00830 • Published Apr 2 • 15
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 81
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research Paper • 2505.19955 • Published May 26, 2025 • 14