Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
Paper • 2502.14768 • Published • 47
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
Paper • 2502.12853 • Published • 29
Diverse Inference and Verification for Advanced Reasoning
Paper • 2502.09955 • Published • 18
Distillation Scaling Laws
Paper • 2502.08606 • Published • 47
Small Models Struggle to Learn from Strong Reasoners
Paper • 2502.12143 • Published • 39
OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
Paper • 2502.11271 • Published • 18
CRANE: Reasoning with constrained LLM generation
Paper • 2502.09061 • Published • 21
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 433
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper • 2503.01785 • Published • 85
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
Paper • 2503.07536 • Published • 88
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
Paper • 2503.06749 • Published • 31
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 122
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model
Paper • 2503.05132 • Published • 57
START: Self-taught Reasoner with Tools
Paper • 2503.04625 • Published • 113
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30