ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning Paper • 2603.10160 • Published Mar 10, 2026 • 26
Toward Cognitive Supersensing in Multimodal Large Language Model Paper • 2602.01541 • Published Feb 2, 2026 • 16
Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles Paper • 2309.10228 • Published Sep 19, 2023
On-Board Vision-Language Models for Personalized Autonomous Vehicle Motion Control: System Design and Real-World Validation Paper • 2411.11913 • Published Nov 17, 2024
MedSAM3: Delving into Segment Anything with Medical Concepts Paper • 2511.19046 • Published Nov 24, 2025 • 55
NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning Paper • 2307.08941 • Published Jul 18, 2023 • 1
Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond Paper • 2403.10667 • Published Mar 15, 2024 • 1
SelfElicit: Your Language Model Secretly Knows Where is the Relevant Evidence Paper • 2502.08767 • Published Feb 12, 2025
SocialGesture: Delving into Multi-person Gesture Understanding Paper • 2504.02244 • Published Apr 3, 2025
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs Paper • 2506.21656 • Published Jun 26, 2025 • 16
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance Paper • 2506.06444 • Published Jun 6, 2025 • 73
CORG: Generating Answers from Complex, Interrelated Contexts Paper • 2505.00023 • Published Apr 25, 2025 • 9
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing Paper • 2503.13434 • Published Mar 17, 2025 • 27
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published Feb 25, 2025 • 75
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation Paper • 2412.09349 • Published Dec 12, 2024 • 8
SciCode: A Research Coding Benchmark Curated by Scientists Paper • 2407.13168 • Published Jul 18, 2024 • 17
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents Paper • 2401.00812 • Published Jan 1, 2024 • 12
What is the Visual Cognition Gap between Humans and Multimodal LLMs? Paper • 2406.10424 • Published Jun 14, 2024