IDLM-Duo
IDLM-Duo is an Inverse-distilled Diffusion Language Model distilled from the pretrained Duo diffusion language model. It is released with the paper IDLM: Inverse-distilled Diffusion Language Models.
IDLM extends inverse distillation to discrete token spaces. Instead of running a pretrained diffusion language model for hundreds or thousands of reverse-diffusion steps, IDLM trains a few-step student generator using an auxiliary fake model and the teacher diffusion objective.
- Project page: https://david-cripto.github.io/idlm-project-page/
- Code: https://github.com/David-cripto/IDLM
- Paper: https://arxiv.org/abs/2602.19066
Model Details
- Model family: IDLM, discrete diffusion language model
- Teacher checkpoint: s-sahoo/duo
- Diffusion type: uniform-state / Duo-style diffusion
- Training data: OpenWebText
- Tokenizer: GPT-2 tokenizer
- Context length: 1024 tokens
- Parameters: 169,627,250
- Tensor type: F32 Safetensors
- Architecture config: 12 blocks, 12 heads, hidden size 768, conditioning dimension 128, dropout 0.1
- License: MIT
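As a rough check on the figures above, the weights-only memory footprint follows from the parameter count and the F32 tensor type. This is a back-of-the-envelope sketch: it ignores activations, optimizer state, and framework overhead, so actual memory use during sampling will be higher.

```python
# Weights-only memory estimate from the model card figures.
NUM_PARAMETERS = 169_627_250  # parameter count listed in Model Details
BYTES_PER_F32 = 4             # an F32 value occupies 4 bytes

weight_bytes = NUM_PARAMETERS * BYTES_PER_F32
weight_mb = weight_bytes / 1e6     # decimal megabytes
weight_mib = weight_bytes / 2**20  # binary mebibytes

print(f"{weight_bytes:,} bytes = {weight_mb:.1f} MB = {weight_mib:.1f} MiB")
```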
Intended Use
This checkpoint is intended for research on diffusion language models, inverse distillation, and few-step sampling.
Installation
The sampling code depends on CUDA and FlashAttention.
git clone https://github.com/David-cripto/IDLM.git
cd IDLM
conda create -n idlm python=3.12
conda activate idlm
conda install nvidia/label/cuda-12.4.0::cuda-toolkit
pip install -r requirements.txt
pip install flash_attn==2.7.4.post1
Loading the Checkpoint
The Hugging Face repository contains custom model code, so pass trust_remote_code=True when loading the model.
from transformers import AutoModelForMaskedLM, AutoTokenizer
model_id = "kekchpek/idlm-duo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(
    model_id,
    trust_remote_code=True,
)
Direct AutoModelForMaskedLM loading exposes the denoising network. For text generation, use the sampler in the official IDLM repository.
Generate Samples
mkdir -p samples
python -m main \
    mode=sample_eval \
    loader.batch_size=2 \
    loader.eval_batch_size=8 \
    data=openwebtext-split \
    algo=duo \
    algo.backbone=hf_dit \
    eval.checkpoint_path=kekchpek/idlm-duo \
    sampling.steps=16 \
    sampling.num_sample_batches=10 \
    sampling.noise_removal=greedy \
    +wandb.offline=true \
    eval.generated_samples_path=samples/idlm_duo_16steps.json
To compare step counts, rerun the command with different sampling.steps values. The paper reports both ancestral (a) and Greedy-Tail (g) sampling variants.
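One way to organize such a sweep is to build one command per step count. The sketch below only constructs and prints the commands. It reuses the overrides from the single-run command above, takes the step counts from the Evaluation section, and assumes the output files follow the same naming pattern as the 16-step example. The helper name `build_sample_command` is hypothetical and not part of the IDLM repository.

```python
import shlex

# Step counts reported in the Evaluation section.
STEP_COUNTS = [4, 8, 16, 32]


def build_sample_command(steps: int) -> list[str]:
    """Return the argv list for one sampling run with the given step count."""
    return [
        "python", "-m", "main",
        "mode=sample_eval",
        "loader.batch_size=2",
        "loader.eval_batch_size=8",
        "data=openwebtext-split",
        "algo=duo",
        "algo.backbone=hf_dit",
        "eval.checkpoint_path=kekchpek/idlm-duo",
        f"sampling.steps={steps}",
        "sampling.num_sample_batches=10",
        "sampling.noise_removal=greedy",
        "+wandb.offline=true",
        # Assumed naming pattern, mirroring the 16-step example.
        f"eval.generated_samples_path=samples/idlm_duo_{steps}steps.json",
    ]


commands = [build_sample_command(steps) for steps in STEP_COUNTS]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually launch the runs, each list can be passed to `subprocess.run(cmd, check=True)` from the root of the cloned IDLM repository, inside the environment set up during installation.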
Evaluation
The paper reports generation perplexity (GenPPL, lower is better) and sample entropy (higher is better) on OpenWebText-style generation. The released evaluation code defaults to gpt2-large for GenPPL.
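To make the two metrics concrete, here is a minimal sketch of how they are commonly computed. It assumes that sample entropy is the Shannon entropy of the token-frequency distribution within one generated sample, measured in nats, and that GenPPL is the exponential of the mean per-token negative log-likelihood under the scoring model. The released evaluation code may differ in details such as batching, averaging, or units, and the function names here are illustrative only.

```python
import math
from collections import Counter


def sample_entropy(token_ids):
    """Shannon entropy (in nats) of the token frequencies within one sample."""
    counts = Counter(token_ids)
    total = len(token_ids)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


# Two equally frequent tokens give an entropy of ln 2.
uniform_two = sample_entropy([1, 1, 2, 2])

# If the scoring model assigns probability 1/4 to every token, perplexity is 4.
ppl = perplexity([math.log(4.0)] * 3)

print(uniform_two, ppl)
```

Under these definitions, a sample that repeats a single token has zero entropy, which is why low GenPPL on its own does not rule out degenerate, repetitive text and the two metrics are reported together.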
| Sampling steps | Sampler | GenPPL (lower is better) | Entropy (higher is better) |
|---|---|---|---|
| 32 | Greedy-Tail | 54.05 | 5.49 |
| 16 | Greedy-Tail | 68.04 | 5.55 |
| 8 | Greedy-Tail | 93.00 | 5.56 |
| 4 | Greedy-Tail | 144.74 | 4.28 |
| 32 | Ancestral | 63.10 | 5.54 |
| 16 | Ancestral | 78.00 | 5.58 |
| 8 | Ancestral | 117.88 | 5.62 |
| 4 | Ancestral | 495.85 | 5.56 |
For comparison, the Duo teacher is reported at 1024 steps with GenPPL 71.72 / entropy 5.22 under Greedy-Tail sampling and GenPPL 77.69 / entropy 5.55 under ancestral sampling.
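The comparison can be worked through numerically. The sketch below uses only the GenPPL values reported above and finds, for each sampler, the smallest student step count whose GenPPL is at or below the teacher's, along with the resulting reduction in step count. It deliberately ignores entropy, so it is a partial view of sample quality.

```python
TEACHER_STEPS = 1024

# GenPPL values copied from the table and the teacher figures above.
teacher_genppl = {"Greedy-Tail": 71.72, "Ancestral": 77.69}
student_genppl = {
    "Greedy-Tail": {32: 54.05, 16: 68.04, 8: 93.00, 4: 144.74},
    "Ancestral": {32: 63.10, 16: 78.00, 8: 117.88, 4: 495.85},
}


def fewest_matching_steps(sampler):
    """Smallest student step count with GenPPL at or below the teacher's."""
    matching = [
        steps
        for steps, ppl in student_genppl[sampler].items()
        if ppl <= teacher_genppl[sampler]
    ]
    return min(matching) if matching else None


summary = {}
for sampler in teacher_genppl:
    steps = fewest_matching_steps(sampler)
    reduction = TEACHER_STEPS // steps if steps else None
    summary[sampler] = (steps, reduction)
    print(f"{sampler}: {steps} steps, {reduction}x fewer than the teacher")
```

On these numbers, the student matches the teacher's GenPPL at 16 steps under Greedy-Tail sampling (64x fewer steps) and at 32 steps under ancestral sampling (32x fewer steps). The 16-step ancestral GenPPL of 78.00 is marginally above the teacher's 77.69.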
Training Summary
IDLM-Duo was trained by initializing the student and fake model from the pretrained Duo teacher and alternating between:
- Updating the fake model on student-generated samples using the teacher diffusion loss.
- Updating the student using the teacher-fake loss gap.
The Duo setting uses a Gaussian relaxation and soft token inputs for stable backpropagation through the diffusion objective.
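The alternating scheme described above can be written schematically as follows. The notation is assumed for illustration and is not taken from the paper: $f^{*}$ is the frozen teacher, $f_{\phi}$ the fake model, $G_{\theta}$ the student generator, $p_{\theta}$ the distribution of student samples, and $\mathcal{L}(f; p)$ the teacher's diffusion training loss evaluated for a denoiser $f$ on data drawn from $p$. The paper's exact objective, weighting, and relaxation details may differ.

```latex
\begin{aligned}
\text{Fake update:}    \quad & \phi \leftarrow \phi - \eta_{\phi}\, \nabla_{\phi}\, \mathcal{L}\bigl(f_{\phi};\, p_{\theta}\bigr), \\
\text{Student update:} \quad & \theta \leftarrow \theta - \eta_{\theta}\, \nabla_{\theta} \Bigl[ \mathcal{L}\bigl(f^{*};\, p_{\theta}\bigr) - \mathcal{L}\bigl(f_{\phi};\, p_{\theta}\bigr) \Bigr].
\end{aligned}
```

When the fake model is optimal for the current student samples and the teacher lies in the same model class, the bracketed gap is non-negative, and it reaches zero when the teacher is also optimal on those samples. Driving the gap down therefore pushes the student's samples toward data on which the teacher is already the best denoiser.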
Citation
@article{li2026idlm,
title={IDLM: Inverse-distilled Diffusion Language Models},
author={Li, David and Gushchin, Nikita and Abulkhanov, Dmitry and Moulines, Eric and Oseledets, Ivan and Panov, Maxim and Korotin, Alexander},
journal={arXiv preprint arXiv:2602.19066},
year={2026}
}