Collections including paper arxiv:2604.02176
- ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling
  Paper • 2603.25746 • Published • 155
- TAPS: Task Aware Proposal Distributions for Speculative Sampling
  Paper • 2603.27027 • Published • 144
- Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models
  Paper • 2603.25716 • Published • 156
- LongCat-Next: Lexicalizing Modalities as Discrete Tokens
  Paper • 2603.27538 • Published • 147

- dLLM: Simple Diffusion Language Modeling
  Paper • 2602.22661 • Published • 153
- OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data
  Paper • 2603.15594 • Published • 149
- Qianfan-OCR: A Unified End-to-End Model for Document Intelligence
  Paper • 2603.13398 • Published • 155
- Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
  Paper • 2603.06569 • Published • 120

- Adam's Law: Textual Frequency Law on Large Language Models
  Paper • 2604.02176 • Published • 504
- Demystifing Video Reasoning
  Paper • 2603.16870 • Published • 372
- A Very Big Video Reasoning Suite
  Paper • 2602.20159 • Published • 524
- LightMem: Lightweight and Efficient Memory-Augmented Generation
  Paper • 2510.18866 • Published • 116

- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 328
- Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
  Paper • 2512.23988 • Published • 19
- SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
  Paper • 2512.25075 • Published • 16
- Guiding a Diffusion Transformer with the Internal Dynamics of Itself
  Paper • 2512.24176 • Published • 8

- The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
  Paper • 2509.26507 • Published • 550
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 328
- NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
  Paper • 2601.00393 • Published • 132
- LTX-2: Efficient Joint Audio-Visual Foundation Model
  Paper • 2601.03233 • Published • 179