Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 95
RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing Paper • 2507.20352 • Published Jul 27, 2025