Qwen2.5-7B-Instruct-LoRA-Alpaca-ZH

This repository contains a LoRA adapter for Qwen/Qwen2.5-7B-Instruct, fine-tuned on the Alpaca-GPT4-ZH dataset.

Model Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct (7B parameters)
  • Training Method: QLoRA (4-bit quantization)
  • Trainable Parameters: 20.2M (0.46% of total)
  • Dataset: Alpaca-GPT4-ZH (500 samples)
  • Training Time: ~3.5 minutes
  • Hardware: Lambda Cloud A10 GPU (24GB)
  • Framework: ms-swift

Training Configuration

Model: Qwen/Qwen2.5-7B-Instruct
Training Type: LoRA
Quantization: 4-bit (BitsAndBytes)
LoRA Rank: 8
LoRA Alpha: 32
Target Modules: all-linear
Batch Size: 1
Gradient Accumulation: 4 steps
Learning Rate: 1e-4
Epochs: 1
Max Length: 2048
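
Training was done with ms-swift, but an approximately equivalent setup in plain transformers + PEFT might look like the sketch below. The NF4 quant type is the QLoRA default and an assumption here; the remaining values mirror the configuration above.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization (QLoRA-style; NF4 assumed as the quant type)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# LoRA adapter matching the configuration above
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # reports roughly 20M trainable parameters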

Usage

Using with Transformers + PEFT

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA weights
model = PeftModel.from_pretrained(
    base_model,
    "FutureMa/Qwen2.5-7B-Instruct-LoRA-Alpaca-ZH"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Generate response
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "解释什么是人工智能"}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,  # temperature and top_p only take effect when sampling is enabled
    temperature=0.7,
    top_p=0.9,
)

response = tokenizer.decode(
    outputs[0][len(inputs.input_ids[0]):],
    skip_special_tokens=True
)
print(response)
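
If you prefer a standalone checkpoint over a base-model-plus-adapter pair, PEFT can fold the LoRA weights into the base model via merge_and_unload(). The output directory name below is just an example.

# Merge the LoRA weights into the base model and drop the PEFT wrapper;
# the merged model can then be saved and reloaded like any plain checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("qwen2.5-7b-instruct-alpaca-zh-merged")
tokenizer.save_pretrained("qwen2.5-7b-instruct-alpaca-zh-merged")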

Using with ms-swift

# Inference with fine-tuned model
swift infer --ckpt_dir FutureMa/Qwen2.5-7B-Instruct-LoRA-Alpaca-ZH
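
Depending on your ms-swift version, the adapter can also be merged into a standalone checkpoint with swift export (for example via its --merge_lora option); check swift export --help, since flag names have changed across releases.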

Training Results

Comparison: Base vs Fine-tuned Model

Question: "给出三个健康饮食的建议" ("Give three suggestions for healthy eating")

Base Model Response:

  • Lengthy, detailed explanations
  • May exhaust the max_new_tokens budget, leaving the answer truncated
  • Draws on general knowledge rather than a consistent instruction-following style

Fine-tuned Model Response:

  • Concise and structured (three clear points)
  • Direct, actionable advice
  • Matches the Alpaca dataset's response style
  • Completes within the token budget

The fine-tuned model shows improvements in:

  • Response structure and clarity
  • Adherence to the instruction format
  • Conciseness without loss of quality
  • Alignment with Chinese instruction-following tasks
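
To reproduce the comparison, you can reuse the model and tokenizer from the Usage section and toggle the adapter off for the base-model run. The ask helper below is a sketch of this; disable_adapter() is PEFT's context manager for temporarily bypassing the LoRA weights.

def ask(prompt):
    # Reuses `model` and `tokenizer` from the Usage section above
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
    return tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

prompt = "给出三个健康饮食的建议"  # "Give three suggestions for healthy eating"
with model.disable_adapter():  # base-model behaviour, LoRA bypassed
    print(ask(prompt))
print(ask(prompt))             # fine-tuned behaviour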

Model Performance

  • Training Runtime: 206.66 seconds
  • Training Samples/Second: 2.42
  • Training Loss: 1.395
  • GPU Memory Usage: 7.03 GB (29% of 24GB)
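
These figures are internally consistent: 500 samples / 206.66 s ≈ 2.42 samples per second, and with an effective batch size of 4 (batch size 1 × 4 gradient-accumulation steps) the single epoch amounts to 125 optimizer steps.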

Citation

If you use this model, please cite:

@misc{qwen2.5-7b-lora-alpaca-zh,
  author = {FutureMa},
  title = {Qwen2.5-7B-Instruct-LoRA-Alpaca-ZH},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/FutureMa/Qwen2.5-7B-Instruct-LoRA-Alpaca-ZH}}
}

License

This model is released under the Apache 2.0 license, following the base model's licensing.

Acknowledgments

Thanks to the Qwen team for the Qwen2.5-7B-Instruct base model, to the authors of the Alpaca-GPT4-ZH dataset, and to the ModelScope team for the ms-swift training framework.
