MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents Paper • 2601.12346 • Published 17 days ago • 49
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models Paper • 2601.14004 • Published 14 days ago • 46
MMFormalizer: Multimodal Autoformalization in the Wild Paper • 2601.03017 • Published 28 days ago • 105 • 7
ATTS: Asynchronous Test-Time Scaling via Conformal Prediction Paper • 2509.15148 • Published Sep 18, 2025
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models Paper • 2310.10180 • Published Oct 16, 2023 • 1
D2O: Dynamic Discriminative Operations for Efficient Generative Inference of Large Language Models Paper • 2406.13035 • Published Jun 18, 2024 • 3
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation Paper • 2410.02719 • Published Oct 3, 2024 • 1
UNComp: Can Matrix Entropy Uncover Sparsity? -- A Compressor Design from an Uncertainty-Aware Perspective Paper • 2410.03090 • Published Oct 4, 2024 • 1