
Collections

Discover the best community collections!

Collections including paper arxiv:2508.19229

Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks

Paper • 2310.19909 • Published Oct 30, 2023 • 21
Memory Augmented Language Models through Mixture of Word Experts

Paper • 2311.10768 • Published Nov 15, 2023 • 19
FlashDecoding++: Faster Large Language Model Inference on GPUs

Paper • 2311.01282 • Published Nov 2, 2023 • 38
Prompt Cache: Modular Attention Reuse for Low-Latency Inference

Paper • 2311.04934 • Published Nov 7, 2023 • 32


microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 15.3k • 1.44k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14, 2025 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17, 2025 • 120 • 17
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15, 2025 • 63

Snowflake/Arctic-Text2SQL-R1-7B

8B • Updated May 29, 2025 • 4.95k • 71
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 282
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 265
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published Jun 19, 2025 • 133

Large Language Models Can Self-Improve in Long-context Reasoning

Paper • 2411.08147 • Published Nov 12, 2024 • 65
Reverse Thinking Makes LLMs Stronger Reasoners

Paper • 2411.19865 • Published Nov 29, 2024 • 23
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 94
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 107

