- LinFusion: 1 GPU, 1 Minute, 16K Image
  Paper • 2409.02097 • Published • 34
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
  Paper • 2409.11406 • Published • 27
- Diffusion Models Are Real-Time Game Engines
  Paper • 2408.14837 • Published • 126
- Segment Anything with Multiple Modalities
  Paper • 2408.09085 • Published • 22
Collections
Collections including paper arxiv:2511.14993
- Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
  Paper • 2507.01352 • Published • 60
- A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models
  Paper • 2507.13563 • Published • 53
- Scaling Laws for Optimal Data Mixtures
  Paper • 2507.09404 • Published • 38
- Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
  Paper • 2511.14993 • Published • 233

- TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
  Paper • 2512.16093 • Published • 97
- Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
  Paper • 2511.22699 • Published • 245
- DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
  Paper • 2512.16676 • Published • 222
- Sharp Monocular View Synthesis in Less Than a Second
  Paper • 2512.10685 • Published • 29

- Attention Is All You Need
  Paper • 1706.03762 • Published • 122
- Scaling Laws for Neural Language Models
  Paper • 2001.08361 • Published • 10
- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 11
- Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
  Paper • 2210.04186 • Published

- From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence
  Paper • 2511.18538 • Published • 304
- Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
  Paper • 2511.14993 • Published • 233
- The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
  Paper • 2509.26507 • Published • 550
- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 514

- Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
  Paper • 2511.14993 • Published • 233
- Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks
  Paper • 2511.15065 • Published • 78
- SAM 3: Segment Anything with Concepts
  Paper • 2511.16719 • Published • 135
- Canvas-to-Image: Compositional Image Generation with Multimodal Controls
  Paper • 2511.21691 • Published • 36

- SPATIALGEN: Layout-guided 3D Indoor Scene Generation
  Paper • 2509.14981 • Published • 28
- Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
  Paper • 2511.14993 • Published • 233
- Kling-Omni Technical Report
  Paper • 2512.16776 • Published • 173
- Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
  Paper • 2512.04677 • Published • 177