How to use from SGLang
Install from pip and serve the model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Delta-Vector/Austral-4.5B-Winton" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Delta-Vector/Austral-4.5B-Winton",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
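The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server was started as shown above and is reachable at `http://localhost:30000`. The helper names `build_chat_request` and `chat` are hypothetical and introduced here for illustration; the response parsing assumes the usual OpenAI-compatible shape (`choices[0].message.content`).

```python
import json
import urllib.request

# Assumption: the SGLang server from the commands above is running locally.
BASE_URL = "http://localhost:30000"
MODEL = "Delta-Vector/Austral-4.5B-Winton"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=60):
    """Send the request and return the assistant's reply text.

    Hypothetical helper: assumes an OpenAI-compatible JSON response.
    """
    req = build_chat_request(prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` should return the model's reply as a string. Building the request separately from sending it makes the payload easy to inspect without a live server.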
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Delta-Vector/Austral-4.5B-Winton" \
        --host 0.0.0.0 \
        --port 30000

Austral 4.5B Winton

Trained by Delta-Vector

Overview

Austral 4.5B - Winton

An AFM-based, KTO-enhanced Adventure/Roleplay generalist model, 4.5B in size

More than 1.5 metres tall, about six metres long and up to 1,000 kilograms heavy, Australovenator wintonensis was a fast and agile hunter, and the largest known Australian theropod.

This is a finetune of arcee-ai/AFM-4.5B, intended as a generalist Roleplay/Adventure model. It was a multi-stage finetune (SFT -> KTO). In testing it has proven to be a great model for Adventure cards and Roleplay, often pushing the plot forward better than other models while avoiding some of the slop you'd find in models from Drummer and Co. It also has enhanced knowledge of roleplaying domains compared to the base model.

Support my finetunes / me on Ko-fi: https://Ko-fi.com/deltavector | Thank you to Auri/Joe for helping/testing ♥

Quants

Quants Formats

  • GGUF: For use with llama.cpp & forks (thanks Mradermacher!)
  • EXL3: For use with TabbyAPI (coming soon!)

Chat Format

This model utilizes ChatML.

<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
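
For frontends that need the raw prompt string, the layout above can be sketched as a small formatter. This is a minimal illustration, not the author's tooling: the function name `render_chatml` and its `add_generation_prompt` parameter are hypothetical, and the sketch assumes each turn is terminated by `<|im_end|>` followed by a newline, as in the example.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper: follows the turn layout shown in the example above.
    """
    parts = []
    for message in messages:
        # Each turn: start tag, role, newline, content, end tag, newline.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the next reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]
prompt = render_chatml(messages)
print(prompt)
```

When the model is served through an OpenAI-compatible chat endpoint, the server normally applies the chat template itself, so a formatter like this is mainly useful for raw text-completion backends.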

Training

This model was trained over 4 epochs on 8 x 3090s for the base SFT. I then ran KTO for 1 epoch to clean up some coherency issues. Total training time was roughly 8 hours.

Credits

TYSM to my friends: Auri, Minh, Trappu, Alicat, Kubernetes Bad, Intervitens, NyxKrage & Kalomaze

Safetensors · Model size: 5B params · Tensor type: BF16