On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification • Paper • 2508.05629 • Published Aug 7, 2025 • 181
naver-hyperclovax/HyperCLOVAX-SEED-Think-14B • Text Generation • 15B • Updated Aug 27, 2025 • 4.34k • 111
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 • Image-to-Text • 402B • Updated May 22, 2025 • 27.9k • 147