Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams Paper • 2603.07392 • Published Mar 8 • 18
MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents Paper • 2603.09827 • Published about 1 month ago • 30
AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories Paper • 2602.14941 • Published Feb 16 • 6
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning Paper • 2602.08236 • Published Feb 9 • 9
Reliable and Responsible Foundation Models: A Comprehensive Survey Paper • 2602.08145 • Published Feb 4 • 8
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 57
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning Paper • 2512.02425 • Published Dec 2, 2025 • 25
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning Paper • 2507.06485 • Published Jul 9, 2025 • 5
MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation Paper • 2506.17113 • Published Jun 20, 2025 • 5
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models Paper • 2506.07177 • Published Jun 8, 2025 • 23
Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning Paper • 2506.03525 • Published Jun 4, 2025 • 6
EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance Paper • 2505.21876 • Published May 28, 2025 • 9
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23, 2025 • 81
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs Paper • 2503.01820 • Published Mar 3, 2025 • 2
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning Paper • 2502.15082 • Published Feb 20, 2025 • 1