COMPASS Qwen2.5-7B-Instruct LoRA (Policy-aware LODO SFT)

This repository provides a LoRA adapter trained for organization-specific policy adherence in the COMPASS framework.

Training Data

Policy-aware SFT dataset built from COMPASS scenarios:

  • Setup: Leave-One-Domain-Out (LODO)
  • Held-out domain: TelePath (Telecom)
  • Train domains (7): AutoViaMotors, CityGov, FinSecure, MediCarePlus, PlanMyTrip, TutoraVerse, VirtuRecruit
  • Training size: 4,121 query–response pairs

Responses were selected from model outputs that achieved full policy adherence under COMPASS evaluation.

Training Configuration

  • Method: LoRA adapters
  • Epochs: 3
  • LoRA rank (r): 64
  • LoRA alpha: 128
  • Peak learning rate: 5e-4
  • Optimizer: AdamW
  • Batch size: 32
  • LR schedule: cosine
  • Quantization: 8-bit during training

Evaluation (Held-out TelePath Domain)

Policy Alignment Score (PAS) breakdown on TelePath:

Model Method Allowed Base Allowed Edge Denied Base Denied Edge
Qwen2.5-7B-Instruct Base system prompt 96.67 85.71 24.00 0.00
Qwen2.5-7B-Instruct LODO SFT (LoRA) 96.67 89.52 71.74 60.49

Citation

@misc{choi2026compass,
      title={COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs}, 
      author={Dasol Choi and DongGeon Lee and Brigitta Jesica Kartono and Helena Berndt and Taeyoun Kwon and Joonwon Jang and Haon Park and Hwanjo Yu and Minsuk Kahng},
      year={2026},
      eprint={2601.01836},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2601.01836}, 
}
Downloads last month
6
Safetensors
Model size
8B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for AIM-Intelligence/COMPASS_Qwen2.5-7B-Instruct_LoRA

Base model

Qwen/Qwen2.5-7B
Adapter
(824)
this model

Dataset used to train AIM-Intelligence/COMPASS_Qwen2.5-7B-Instruct_LoRA

Collection including AIM-Intelligence/COMPASS_Qwen2.5-7B-Instruct_LoRA

Paper for AIM-Intelligence/COMPASS_Qwen2.5-7B-Instruct_LoRA