FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging Paper • 2602.08024 • Published Feb 8 • 2
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27, 2025 • 181
Faster Text Generation with Self-Speculative Decoding Article • ariG23498, melhoushi, pcuenq, reach-vb • Nov 20, 2024 • 65
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception Paper • 2505.04410 • Published May 7, 2025 • 44
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published May 8, 2025 • 187
Mixture of Experts Explained Article • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
LLM4SR: A Survey on Large Language Models for Scientific Research Paper • 2501.04306 • Published Jan 8, 2025 • 35
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs Paper • 2406.18629 • Published Jun 26, 2024 • 42