🧠 Qwen2.5-3B-Instruct ReTrace-OpenO1 Merged
A reasoning-focused model trained on 5,000 chain-of-thought examples
📋 Model Description
This is a fully merged model of Qwen2.5-3B-Instruct fine-tuned with LoRA on 5,000 reasoning samples (500 ReTrace + 4,500 OpenO1-SFT). The model generates structured reasoning with explicit <Thought> and <Output> tags, demonstrating enhanced step-by-step problem-solving capabilities.
🎯 Key Features
- ✅ Fully Merged: Ready-to-use model (no adapter loading needed)
- ✅ Structured Reasoning: Outputs thinking in <Thought> tags, final answer in <Output> tags
- ✅ 5K Training Samples: 500 ReTrace + 4,500 OpenO1-SFT examples
- ✅ Multi-Domain: Math, logic, word problems, and general reasoning
- ✅ Production Ready: FP16 weights, roughly 6 GB on disk
📊 Training Loss
📈 Training Statistics
| Metric | Value |
|---|---|
| Initial Loss | 1.3374 |
| Final Loss | 0.6798 |
| Best Loss | 0.6662 (Step 240) |
| Improvement | 49.2% ↓ |
| Total Steps | 310 |
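The improvement figure follows directly from the initial and final loss values in the table. A quick check, with illustrative variable names:

```python
# Loss values copied from the Training Statistics table.
initial_loss = 1.3374
final_loss = 0.6798

# Relative reduction from the first to the last logged loss.
improvement_pct = (initial_loss - final_loss) / initial_loss * 100
print(f"Relative loss reduction: {improvement_pct:.1f}%")  # prints 49.2%
```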
⚙️ Training Configuration
```python
# Model
BASE_MODEL = "Qwen/Qwen2.5-3B-Instruct"
MAX_SEQ_LENGTH = 4096

# LoRA
LORA_R = 64
LORA_ALPHA = 128
LORA_DROPOUT = 0.05

# Training
BATCH_SIZE = 8
GRADIENT_ACCUMULATION = 4
LEARNING_RATE = 2e-4
NUM_EPOCHS = 2
WARMUP_STEPS = 50

# Datasets
# - 500 samples from ReTrace501-v1
# - 4,500 samples from OpenO1-SFT
```
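Two derived quantities help when reading these settings: the effective batch size and the LoRA scaling factor. The sketch below assumes a single training device and standard LoRA scaling (alpha / r, not rank-stabilized); neither is stated in the card. The step estimate is a rough upper bound, since trainers differ in how they handle the last partial batch.

```python
import math

# Values copied from the training configuration.
BATCH_SIZE = 8
GRADIENT_ACCUMULATION = 4
NUM_EPOCHS = 2
LORA_R = 64
LORA_ALPHA = 128
NUM_SAMPLES = 5_000  # 500 ReTrace + 4,500 OpenO1-SFT

# Assumption: one device, so no extra data-parallel multiplier.
effective_batch = BATCH_SIZE * GRADIENT_ACCUMULATION

# Assumption: standard LoRA, where the low-rank update is scaled by alpha / r.
lora_scale = LORA_ALPHA / LORA_R

# Rough estimate of optimizer steps over all epochs.
steps_per_epoch = math.ceil(NUM_SAMPLES / effective_batch)
estimated_steps = steps_per_epoch * NUM_EPOCHS

print(effective_batch, lora_scale, estimated_steps)  # prints 32 2.0 314
```

The estimate of about 314 steps is close to the 310 steps reported in the training statistics. The card does not say where the small gap comes from.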
🚀 Usage
Installation
```bash
pip install torch transformers accelerate
```
Quick Inference
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# =========================
# Load model and tokenizer
# =========================
model_name = "nnsohamnn/Qwen2.5-3B-ReTrace-OpenO1-Merged"

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# =========================
# LLM question function
# =========================
def ask_llm(question: str) -> str:
    messages = [
        {
            "role": "system",
            "content": (
                "You are a helpful AI assistant. When solving problems, show your detailed reasoning process inside <Thought> tags, then provide your final answer inside <Output> tags and explain the final answer from reasoning in short. Break down complex problems step-by-step."
            )
        },
        {
            "role": "user",
            "content": question
        }
    ]

    # Render the chat messages into the prompt format the model was trained on
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        do_sample=True,  # make sampling explicit so temperature/top_p take effect
        temperature=0.7,
        top_p=0.9
    )

    # Decode only the newly generated tokens, skipping the prompt
    prompt_len = inputs["input_ids"].shape[1]
    response = tokenizer.decode(
        outputs[0][prompt_len:],
        skip_special_tokens=True
    )
    return response

# =========================
# Change ONLY this block
# =========================
question = """
A machine produces items where 4% of the output is defective. A quality control test correctly identifies a defective item with probability 0.95 and incorrectly labels a good item as defective with probability 0.03. If an item is selected at random and the test reports it as defective, determine the probability that the item is actually defective.
"""
print(ask_llm(question))
```
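Because the model wraps its reasoning in <Thought> tags and its final answer in <Output> tags, downstream code will often want the two parts separately. The following is a minimal sketch, not part of the model repository. `split_reasoning` is a hypothetical helper that assumes each tag pair appears at most once. It returns `None` for a section whose tags are missing, for example when generation is cut off by `max_new_tokens`.

```python
import re


def split_reasoning(response: str) -> dict:
    """Split a response into its <Thought> and <Output> parts.

    Hypothetical helper: a section is None when its tag pair is absent.
    """
    def extract(tag: str):
        # DOTALL lets '.' span the multi-line content between the tags.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", response, flags=re.DOTALL)
        return match.group(1).strip() if match else None

    return {"thought": extract("Thought"), "output": extract("Output")}


# Small hand-written sample in the model's output format.
sample = "<Thought>\n2 + 2 = 4\n</Thought>\n<Output>\n4\n</Output>"
parts = split_reasoning(sample)
print(parts["output"])  # prints 4
```

In practice the string returned by `ask_llm` would be passed in place of `sample`.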
Expected Output
Question
A machine produces items where 4% of the output is defective. A quality control test correctly identifies a defective item with probability 0.95 and incorrectly labels a good item as defective with probability 0.03. If an item is selected at random and the test reports it as defective, determine the probability that the item is actually defective.
<Thought>
Let's define the events:
- \( D \): The event that the item is defective.
- \( D^c \): The event that the item is not defective.
- \( T \): The event that the test reports the item as defective.
Given probabilities:
- \( P(D) = 0.04 \) (4% defective)
- \( P(T|D) = 0.95 \) (Test correctly identifies defective items)
- \( P(T|D^c) = 0.03 \) (Test incorrectly labels good items as defective)
We need to find \( P(D|T) \), the probability that the item is defective given that the test reports it as defective.
Using Bayes' theorem:
\[
P(D|T) = \frac{P(T|D)P(D)}{P(T)}
\]
First, we need to find \( P(T) \), the total probability that the test reports a defective item. This can be found using the law of total probability:
\[
P(T) = P(T|D)P(D) + P(T|D^c)P(D^c)
\]
Calculate each term:
\[
P(D^c) = 1 - P(D) = 1 - 0.04 = 0.96
\]
\[
P(T|D^c) = 0.03
\]
\[
P(T) = (0.95)(0.04) + (0.03)(0.96) = 0.038 + 0.0288 = 0.0668
\]
Now, substitute back into Bayes' theorem:
\[
P(D|T) = \frac{(0.95)(0.04)}{0.0668} = \frac{0.038}{0.0668} \approx 0.572
\]
So, the probability that the item is actually defective given that the test reports it as defective is approximately 57.2%.
</Thought>
<Output>
The probability that the item is actually defective given that the test reports it as defective is approximately 57.2%.
</Output>
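Sampled output can vary between runs, and small models occasionally slip on arithmetic, so it can be worth re-checking the numbers. The sketch below recomputes the example directly, with illustrative variable names.

```python
# Probabilities from the problem statement.
p_defective = 0.04         # P(D)
p_flag_given_def = 0.95    # P(T | D)
p_flag_given_good = 0.03   # P(T | D^c)

# Law of total probability: P(T)
p_flag = p_flag_given_def * p_defective + p_flag_given_good * (1 - p_defective)

# Bayes' theorem: P(D | T)
posterior = p_flag_given_def * p_defective / p_flag

print(f"P(T) = {p_flag:.4f}, P(D|T) = {posterior:.4f}")
# prints P(T) = 0.0668, P(D|T) = 0.5689
```

The setup and the intermediate value P(T) = 0.0668 match the sample output. The final division gives about 0.569 (56.9%), so the 57.2% in the sample appears to be a small arithmetic slip in the model's last step.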
📚 Training Datasets
ReTrace501-v1 (500 samples)
High-quality chain-of-thought reasoning examples focusing on mathematical problem-solving with explicit reasoning steps.
Source: nnsohamnn/ReTrace501-v1
OpenO1-SFT (4,500 samples)
Diverse reasoning dataset covering multiple domains including logic, math, science, and general problem-solving.
Source: O1-OPEN/OpenO1-SFT
🔧 Technical Details
| Component | Specification |
|---|---|
| Architecture | Qwen2.5 Transformer |
| Parameters | 3.09 Billion |
| Context Length | 4096 tokens |
| Precision | FP16 |
| Training Framework | Unsloth + HuggingFace Transformers |
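The parameter count and precision in the table give a rough estimate of the memory needed for the weights alone. This back-of-the-envelope sketch ignores activations, the KV cache, and framework overhead, all of which add to real usage.

```python
# Figures taken from the Technical Details table.
num_params = 3.09e9    # 3.09 billion parameters
bytes_per_param = 2    # FP16 stores each weight in 2 bytes

weights_bytes = num_params * bytes_per_param
size_gb = weights_bytes / 1e9      # decimal gigabytes
size_gib = weights_bytes / 2**30   # binary gibibytes

print(f"{size_gb:.2f} GB ({size_gib:.2f} GiB)")  # prints 6.18 GB (5.76 GiB)
```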
📖 Citation
```bibtex
@misc{qwen25-retrace-openo1-merged,
  author    = {nnsohamnn},
  title     = {Qwen2.5-3B ReTrace-OpenO1 Merged},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/nnsohamnn/Qwen2.5-3B-ReTrace-OpenO1-Merged}
}
```
🔗 Related Resources
- LoRA Adapters: nnsohamnn/Qwen2.5-3B-ReTrace-OpenO1-5k-QLoRA
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Demo Space: Try it live!
🙏 Acknowledgments
- Qwen Team for the excellent base model
- Unsloth AI for efficient training tools
- OpenO1 communities for high-quality datasets
📝 License
Apache 2.0 - See LICENSE for details.
