Exploring MLLM-Diffusion Information Transfer with MetaCanvas Paper • 2512.11464 • Published 24 days ago • 12
RELIC: Interactive Video World Model with Long-Horizon Memory Paper • 2512.04040 • Published Dec 3, 2025 • 23
MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues Paper • 2512.03046 • Published Dec 2, 2025 • 11
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 244
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12, 2025 • 200
World Simulation with Video Foundation Models for Physical AI Paper • 2511.00062 • Published Oct 28, 2025 • 40
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model Paper • 2510.19871 • Published Oct 22, 2025 • 29
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives Paper • 2510.20822 • Published Oct 23, 2025 • 40
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset Paper • 2510.15742 • Published Oct 17, 2025 • 50
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published Oct 7, 2025 • 141
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18, 2025 • 111
Rethinking Verification for LLM Code Generation: From Generation to Testing Paper • 2507.06920 • Published Jul 9, 2025 • 28
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data Paper • 2507.07095 • Published Jul 9, 2025 • 55
Calligrapher: Freestyle Text Image Customization Paper • 2506.24123 • Published Jun 30, 2025 • 37
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Paper • 2502.17157 • Published Feb 24, 2025 • 52
Goku: Flow Based Video Generative Foundation Models Paper • 2502.04896 • Published Feb 7, 2025 • 106
MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training Paper • 2501.07556 • Published Jan 13, 2025 • 7