Reasoning Transfer

classroom

Recent Activity

aaabiao authored a paper 8 days ago

Aligning Instruction Tuning with Pre-training

aaabiao authored a paper 8 days ago

YuE: Scaling Open Foundation Models for Long-Form Music Generation

aaabiao authored a paper 8 days ago

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

authored 8 papers 8 days ago

Aligning Instruction Tuning with Pre-training

Paper • 2501.09368 • Published Jan 16, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Paper • 2503.08638 • Published Mar 11, 2025 • 73

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

Paper • 2509.24709 • Published Sep 29, 2025 • 7

Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures

Paper • 2510.14616 • Published Oct 16, 2025 • 13

COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes

Paper • 2510.14763 • Published Oct 16, 2025 • 14

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Paper • 2512.24617 • Published Dec 31, 2025 • 66

Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

Paper • 2602.22675 • Published Feb 26 • 23

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

Paper • 2606.18023 • Published 9 days ago • 203

authored a paper 3 months ago

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published Mar 17 • 312

authored 4 papers 6 months ago

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Paper • 2510.24702 • Published Oct 28, 2025 • 31

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Paper • 2510.25726 • Published Oct 29, 2025 • 47

Simulating Environments with Reasoning Models for Agent Training

Paper • 2511.01824 • Published Nov 3, 2025 • 2

On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

Paper • 2512.07783 • Published Dec 8, 2025 • 40

updated a model 9 months ago

ReasoningTransferability/UniReason-Qwen3-14B-think-SFT

Text Generation • 15B • Updated Sep 28, 2025 • 6

authored a paper 10 months ago

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published Aug 24, 2025 • 80

updated 2 models 10 months ago

ReasoningTransferability/UniReason-Qwen3-14B-no-think-SFT

Text Generation • 15B • Updated Aug 25, 2025 • 6 • 1

ReasoningTransferability/UniReason-Qwen3-14B-RL

Text Generation • 15B • Updated Aug 25, 2025 • 19 • 3

updated a dataset 12 months ago

ReasoningTransferability/math_rl_48k

Viewer • Updated Jul 11, 2025 • 48.8k • 6

published a dataset 12 months ago

ReasoningTransferability/math_rl_48k

Viewer • Updated Jul 11, 2025 • 48.8k • 6

authored a paper 12 months ago

First Return, Entropy-Eliciting Explore

Paper • 2507.07017 • Published Jul 9, 2025 • 24