Collections including paper arxiv:2511.19399

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 63
ResearchRubrics: A Benchmark of Prompts and Rubrics For Evaluating Deep Research Agents

Paper • 2511.07685 • Published Nov 10, 2025 • 10
OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment

Paper • 2510.07743 • Published Oct 9, 2025 • 14

Models and data associated with DR Tulu, http://allenai-web/papers/drtulu

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 63
rl-research/DR-Tulu-8B

Text Generation • 8B • Updated Feb 24 • 3.1k • 73
rl-research/DR-Tulu-SFT-8B

Text Generation • 8B • Updated Nov 29, 2025 • 333 • 5
rl-research/dr-tulu-sft-data

Viewer • Updated Nov 25, 2025 • 13.1k • 249 • 28

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 105
Robot Learning from a Physical World Model

Paper • 2511.07416 • Published Nov 10, 2025 • 32
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning

Paper • 2511.06805 • Published Nov 10, 2025 • 13
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

Paper • 2511.17592 • Published Nov 17, 2025 • 121

Reinforcement learning

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30, 2024 • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4, 2025 • 104
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25, 2025 • 75

General Agentic Memory Via Deep Research

Paper • 2511.18423 • Published Nov 23, 2025 • 170
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 63

Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval and Synthesis for SLMs

Paper • 2509.24107 • Published Sep 28, 2025 • 80
Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window

Paper • 2510.08276 • Published Oct 9, 2025 • 10
DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking

Paper • 2510.20168 • Published Oct 23, 2025 • 28
Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics

Paper • 2510.17797 • Published Oct 20, 2025 • 11

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 15.5k • 1.44k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14, 2025 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17, 2025 • 122 • 17
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15, 2025 • 63
