Unified Reinforcement and Imitation Learning for Vision-Language Models Paper • 2510.19307 • Published Oct 22, 2025 • 30
Masking Teacher and Reinforcing Student for Distilling Vision-Language Models Paper • 2512.22238 • Published 11 days ago • 17
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2512.20848 • Published 11 days ago • 28
FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos Paper • 2512.10927 • Published 23 days ago • 5
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation Paper • 2512.17012 • Published 16 days ago • 42
Generative Refocusing: Flexible Defocus Control from a Single Image Paper • 2512.16923 • Published 16 days ago • 37
Zoom-Zero: Reinforced Coarse-to-Fine Video Understanding via Temporal Zoom-in Paper • 2512.14273 • Published 18 days ago • 7
Cosmos-Reason2 Collection Cosmos Reason 2 is an open, customizable, reasoning vision language model (VLM) for physical AI and robotics • 14 items • Updated 9 days ago • 7
BlurDM: A Blur Diffusion Model for Image Deblurring Paper • 2512.03979 • Published about 1 month ago • 3
VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models Paper • 2511.07299 • Published Nov 10, 2025 • 5