World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 6 days ago • 114
Exploring Spatial Intelligence from a Generative Perspective Paper • 2604.20570 • Published 11 days ago • 21
OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering Paper • 2604.08209 • Published 24 days ago • 25
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality Paper • 2512.07951 • Published Dec 8, 2025 • 51
Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs Paper • 2510.13795 • Published Oct 15, 2025 • 59
Build error Agents 1.19k ControlNet V1.1 📉 1.19k Generate edited images using edge, pose, and other guides
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Paper • 2502.17157 • Published Feb 24, 2025 • 52