Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 11 days ago • 48
UniUGP: Unifying Understanding, Generation, and Planing For End-to-end Autonomous Driving Paper • 2512.09864 • Published 24 days ago • 10
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025 • 89
Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction Paper • 2510.04759 • Published Oct 6, 2025 • 9 • 2
AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use Paper • 2505.12650 • Published May 19, 2025 • 8
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting Paper • 2311.11700 • Published Nov 20, 2023 • 4
MultiCrafter: High-Fidelity Multi-Subject Generation via Spatially Disentangled Attention and Identity-Aware Reinforcement Learning Paper • 2509.21953 • Published Sep 26, 2025 • 6
From One to More: Contextual Part Latents for 3D Generation Paper • 2507.08772 • Published Jul 11, 2025 • 25