Instructions for using ClaudioItaly/Cigno-8B with libraries and local apps.

Libraries

How to use ClaudioItaly/Cigno-8B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ClaudioItaly/Cigno-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ClaudioItaly/Cigno-8B")
model = AutoModelForCausalLM.from_pretrained("ClaudioItaly/Cigno-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate up to 40 new tokens, then decode only the part after the prompt
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
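
With chat-style input, the pipeline call in the first snippet returns (in recent Transformers versions) a list containing one dictionary whose `generated_text` field holds the whole conversation, with the model's reply as the final message. The sketch below pulls out that reply. `last_reply` is an illustrative helper name, and the sample `result` is hand-written stand-in data, not real model output.

```python
def last_reply(result: list) -> str:
    """Return the assistant's reply from a chat-style text-generation result."""
    conversation = result[0]["generated_text"]
    final_message = conversation[-1]
    if final_message["role"] != "assistant":
        raise ValueError("last message is not from the assistant")
    return final_message["content"]


# Hand-written stand-in for the value returned by pipe(messages).
result = [
    {
        "generated_text": [
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "(model reply goes here)"},
        ]
    }
]
print(last_reply(result))
```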

Local Apps

vLLM

How to use ClaudioItaly/Cigno-8B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ClaudioItaly/Cigno-8B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ClaudioItaly/Cigno-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
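
The same request can also be sent from Python. This is a minimal sketch using only the standard library. It assumes the server started above is already listening on `localhost:8000`; the helper names `build_chat_payload` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "ClaudioItaly/Cigno-8B"


def build_chat_payload(prompt: str, model: str = MODEL_ID) -> dict:
    """Build the JSON body for an OpenAI-compatible chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST the prompt to /v1/chat/completions and return the reply text.

    Assumes a server is already running at base_url (assumed default).
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible servers place the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` sends the same question as the curl command. Passing a different `base_url` points the helper at another OpenAI-compatible endpoint.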

Use Docker

docker model run hf.co/ClaudioItaly/Cigno-8B

SGLang

How to use ClaudioItaly/Cigno-8B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ClaudioItaly/Cigno-8B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ClaudioItaly/Cigno-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ClaudioItaly/Cigno-8B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ClaudioItaly/Cigno-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use ClaudioItaly/Cigno-8B with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for ClaudioItaly/Cigno-8B to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for ClaudioItaly/Cigno-8B to start chatting

Using Hugging Face Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ClaudioItaly/Cigno-8B to start chatting

Load model with FastModel

# Install Unsloth first (run in a shell): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="ClaudioItaly/Cigno-8B",
    max_seq_length=2048,
)

Docker Model Runner

How to use ClaudioItaly/Cigno-8B with Docker Model Runner:

```
docker model run hf.co/ClaudioItaly/Cigno-8B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

🧠 Qwen 3 8B – LoRA ‘Gutenberg’

Creative minds require limitless memory.
Meet Qwen 3 8B – LoRA ‘Gutenberg’, a fine-tuned version of the Qwen 3 8B language model, enhanced with a LoRA trained on a carefully curated selection of literary texts from Project Gutenberg. This model blends the architectural sophistication of Qwen 3 with the timeless elegance of classical storytelling, producing text that feels both intelligent and human.

🌟 Highlights

🏛️ Gutenberg-powered creativity

Tuned on a literary dataset of 19th- and 20th-century public domain novels, this model excels at generating rich, immersive prose and vivid atmospheric scenes.

🧬 Based on Qwen 3 8B

Built on Alibaba’s Qwen 3 architecture, providing strong multilingual capabilities, improved factual grounding, and efficient long-form reasoning.

🧠 Massive 40,960-token context window

Perfect for extended narrative continuity, legal documents, RAG pipelines, and deep dialogue memory. This wide context allows the model to remember and connect distant narrative threads with ease.
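
In practice the window is shared between the prompt and the generated continuation, so a long prompt leaves less room for output. The sketch below shows that arithmetic. `CONTEXT_WINDOW` takes the figure stated above; the function name and the example token count are illustrative assumptions.

```python
CONTEXT_WINDOW = 40_960  # tokens, as stated in this model card


def max_new_tokens_for(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


# Example: a hypothetical 36,000-token manuscript excerpt used as the prompt.
print(max_new_tokens_for(36_000))  # 4960
```

In the Transformers snippet, the prompt length is available as `inputs["input_ids"].shape[-1]`, and the result can be passed as `max_new_tokens`.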

🔧 LoRA fine-tuning for creativity

LoRA is a lightweight method: it trains a small set of adapter weights while the base weights stay frozen, which aims to add stylistic range without degrading the base model's performance. The adapter is tailored for story generation, dialogue, and introspective monologues.
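
To make "lightweight" concrete: for a weight matrix of shape d × k, LoRA trains two small matrices of shapes d × r and r × k whose product is added to the frozen weight. The sketch below compares parameter counts. The 4096 × 4096 layer size and rank 16 are illustrative assumptions, not the settings used to train this model.

```python
def lora_trainable_params(d: int, k: int, r: int) -> int:
    """Parameters in the two low-rank factors: B is d x r and A is r x k."""
    return r * (d + k)


def full_params(d: int, k: int) -> int:
    """Parameters in the frozen d x k weight matrix."""
    return d * k


# Hypothetical layer shape and rank, chosen only for illustration.
d, k, r = 4096, 4096, 16
trainable = lora_trainable_params(d, k, r)
total = full_params(d, k)
print(trainable, total, f"{trainable / total:.2%}")
```

Under these assumed numbers the adapter for one such layer holds well under 1% of the parameters of the matrix it modifies.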

✍️ Ideal Use Cases

Fiction and novel generation
Interactive storytelling or RPG dialogue
Literary assistants and writing aides
Creative research, inspiration, and plot development
Long-context memory testing and analysis

🧪 Example Output

Example Output – "Cigno 8B"
(Gutenberg-Fine-Tuned | Qwen 3 | 40k Context Window)

The rain had stopped, but the clouds had gathered over the horizon like a silent army preparing to unleash a second wave of sorrow...
(continue with your narrative text, already perfect)

📦 Uploaded Model Details

Developed by: ClaudioItaly
License: apache-2.0
Finetuned from model: unsloth/qwen3-8b-unsloth-bnb-4bit

This Qwen 3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
