How to use from SGLang
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Accio-Lab/Metis-8B-ColdStart" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Accio-Lab/Metis-8B-ColdStart",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
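The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_payload` and `chat` are illustrative and not part of SGLang or the Metis repository, and the response parsing assumes the usual OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request

BASE_URL = "http://localhost:30000/v1"  # matches --port 30000 in the launch command
MODEL = "Accio-Lab/Metis-8B-ColdStart"


def build_payload(prompt: str, image_url: str, model: str = MODEL) -> dict:
    """Build the same chat-completions body that the curl example sends."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(prompt: str, image_url: str, base_url: str = BASE_URL,
         timeout: float = 120.0) -> str:
    """POST the request to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(prompt, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the reply follows the OpenAI chat-completions response shape.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("Describe this image in one sentence.", "<image URL>")` should return the model's one-sentence description as a string.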
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Accio-Lab/Metis-8B-ColdStart" \
        --host 0.0.0.0 \
        --port 30000
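The container may need some time to download and load the checkpoint before it accepts requests. As a hedged sketch, the helper below polls the server until it responds; it assumes the OpenAI-compatible `/v1/models` route is available on the same port, and the names `http_ok` and `wait_until_ready` are illustrative only.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP errors all mean "not ready yet".
        return False


def wait_until_ready(
    url: str = "http://localhost:30000/v1/models",  # assumed readiness route
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until `probe` succeeds or `timeout_s` seconds have elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Calling `wait_until_ready()` before sending the first chat request avoids connection errors while the model is still loading; the `probe` and `sleep` parameters exist so the loop can be exercised without a live server.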

Metis-8B-ColdStart

Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

Metis-8B-ColdStart is the SFT (Supervised Fine-Tuning) checkpoint of the Metis framework, fine-tuned from Qwen3-VL-8B-Instruct on the curated Metis-ColdStart dataset. This checkpoint serves as the starting point for HDPO reinforcement learning, which produces the final Metis-8B-RL model.

[Paper (arXiv)] | [GitHub] | [RL Model] | [ColdStart Data] | [RL Data]

Model Details

| Attribute | Value |
|---|---|
| Base model | Qwen3-VL-8B-Instruct |
| Training stage | Supervised Fine-Tuning (Cold Start) |
| Training data | Metis-ColdStart (~27K samples) |
| Next stage | Metis-8B-RL (HDPO reinforcement learning) |
| License | Apache-2.0 |

Cold Start Data Curation Pipeline

The SFT corpus is curated from publicly available tool-augmented multimodal trajectories (DeepEyesV2, V-Interaction, Thyme, OpenMMReasoner) through a rigorous three-stage pipeline:

  1. Eradicating hallucinated environmental dynamics — Execute all code in a sandbox environment; discard trajectories with execution failures.
  2. Isolating genuine tool necessity — Filter out samples where the base model achieves pass@8 = 1 without any tools, ensuring only genuinely tool-dependent samples remain.
  3. Multidimensional meta-cognitive filtering — An LLM judge evaluates visual relevance, reasoning coherence, and tool-use rationale to ensure high quality.
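
The three stages above can be read as successive filters over the candidate trajectories. The sketch below illustrates only that control flow; it is not the authors' implementation. The `Trajectory` record and the three callbacks (`runs_cleanly`, `solved_without_tools`, `judge_accepts`) are hypothetical stand-ins for the sandbox, the tool-free pass@8 check on the base model, and the LLM judge.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Trajectory:
    """Hypothetical record for one tool-augmented multimodal sample."""
    question: str
    code_blocks: List[str]  # code the trajectory asks the tool sandbox to run
    answer: str


def curate(
    trajectories: Iterable[Trajectory],
    runs_cleanly: Callable[[str], bool],                  # stage 1: sandbox execution
    solved_without_tools: Callable[[Trajectory], bool],   # stage 2: tool-free pass@8 check
    judge_accepts: Callable[[Trajectory], bool],          # stage 3: LLM judge verdict
) -> List[Trajectory]:
    """Keep only trajectories that survive all three filtering stages."""
    kept = []
    for traj in trajectories:
        # Stage 1: discard trajectories containing any code that fails to execute.
        if not all(runs_cleanly(code) for code in traj.code_blocks):
            continue
        # Stage 2: discard samples the base model already solves without tools.
        if solved_without_tools(traj):
            continue
        # Stage 3: discard samples the judge rejects on relevance, coherence,
        # or tool-use rationale.
        if not judge_accepts(traj):
            continue
        kept.append(traj)
    return kept
```

Ordering the stages this way runs the cheap, deterministic execution check first and leaves the LLM judge, presumably the most expensive step, for the smallest remaining set.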

Training Pipeline

Qwen3-VL-8B-Instruct
        │
        ▼  SFT on Metis-ColdStart (~27K samples)
  Metis-8B-ColdStart  ← (this checkpoint)
        │
        ▼  HDPO on Metis-RL (~5K prompts)
   Metis-8B-RL  (final model)

Usage

Please refer to the GitHub repository for full installation and inference instructions.

Installation

git clone https://github.com/Accio-Lab/Metis.git
cd Metis
pip install -e verl
pip install -e ".[vllm,search_tool,python_code_dep]"

Citation

@article{yan2026metis,
  title={Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models},
  author={Yan, Shilin and Tong, Jintao and Xue, Hongwei and Tang, Xiaojun and Wang, Yangyang and Shi, Kunyu and Zhang, Guannan and Li, Ruixuan and Zou, Yixiong},
  journal={arXiv preprint arXiv:2604.08545},
  year={2026}
}

Acknowledgments

Metis is built upon verl, verl-tool, and Qwen3-VL.
