CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 12 days ago • 263
MinT: Managed Infrastructure for Training and Serving Millions of LLMs Paper • 2605.13779 • Published 12 days ago • 217
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation Paper • 2605.03849 • Published 20 days ago • 124
Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items Paper • 2604.19748 • Published Apr 21 • 250
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 240
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 629
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published Feb 9 • 290
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published Jan 31 • 325
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 353
Multimodal AI Models Collection • Purpose: Models that understand text + image + audio together. • 5 items • Updated Jan 23 • 1
Audio & Speech Models Collection • Purpose: Speech recognition, text-to-speech, music, audio analysis. • 5 items • Updated Jan 23 • 1
Vision Models (Image & Video) Collection • Purpose: Text-to-image, image classification, detection, segmentation. • 5 items • Updated Jan 23 • 1
Text & Code Models (NLP) Collection • Purpose: Text generation, summarization, translation, embeddings, coding. • 5 items • Updated Jan 23 • 1