Instructions to use ClaudioItaly/Cigno-8B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ClaudioItaly/Cigno-8B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ClaudioItaly/Cigno-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ClaudioItaly/Cigno-8B")
model = AutoModelForCausalLM.from_pretrained("ClaudioItaly/Cigno-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use ClaudioItaly/Cigno-8B with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ClaudioItaly/Cigno-8B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ClaudioItaly/Cigno-8B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
```shell
docker model run hf.co/ClaudioItaly/Cigno-8B
```
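The curl call above can also be made from Python. The following is a minimal sketch using only the standard library, assuming the server is already running locally and returns the usual OpenAI-style chat completion response; the helper names `build_chat_request` and `ask` are hypothetical and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "ClaudioItaly/Cigno-8B"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body with a `choices` list.
    """
    request = build_chat_request(prompt, base_url=base_url)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# print(ask("What is the capital of France?"))
```

The same helper should work against any server exposing this endpoint; for a server listening on port 30000, pass `base_url="http://localhost:30000"`.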
- SGLang
How to use ClaudioItaly/Cigno-8B with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "ClaudioItaly/Cigno-8B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ClaudioItaly/Cigno-8B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "ClaudioItaly/Cigno-8B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ClaudioItaly/Cigno-8B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Unsloth Studio
How to use ClaudioItaly/Cigno-8B with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ClaudioItaly/Cigno-8B to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ClaudioItaly/Cigno-8B to start chatting
```
Using HuggingFace Spaces for Unsloth
```shell
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ClaudioItaly/Cigno-8B to start chatting
```
Load model with FastModel
```shell
pip install unsloth
```

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="ClaudioItaly/Cigno-8B",
    max_seq_length=2048,
)
```
- Docker Model Runner
How to use ClaudioItaly/Cigno-8B with Docker Model Runner:
```shell
docker model run hf.co/ClaudioItaly/Cigno-8B
```
🧠 Qwen 3 8B – LoRA ‘Gutenberg’
Creative minds require limitless memory.
Meet Qwen 3 8B – LoRA ‘Gutenberg’, a fine-tuned version of the Qwen 3 8B language model, enhanced with a LoRA adapter trained on a curated selection of literary texts from Project Gutenberg. This model blends the architectural sophistication of Qwen 3 with the timeless elegance of classical storytelling, producing text that feels both intelligent and human.
🌟 Highlights
🏛️ Gutenberg-powered creativity
Tuned on a literary dataset of 19th- and 20th-century public domain novels, this model excels at generating rich, immersive prose and vivid atmospheric scenes.
🧬 Based on Qwen 3 8B
Built on Alibaba’s Qwen 3 architecture, providing strong multilingual capabilities, improved factual grounding, and efficient long-form reasoning.
🧠 Massive 40,960-token context window
Perfect for extended narrative continuity, legal documents, RAG pipelines, and deep dialogue memory. This wide context allows the model to remember and connect distant narrative threads with ease.
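In practice, the prompt and the generated reply share the same context window, so a long manuscript or retrieved document set has to leave room for the output. The sketch below shows that budget arithmetic, assuming the 40,960-token figure stated above; the function names are hypothetical helpers, not part of any library.

```python
CONTEXT_WINDOW = 40_960  # context size stated in this model card


def max_prompt_tokens(max_new_tokens, context_window=CONTEXT_WINDOW):
    """Return how many prompt tokens fit once room for the reply is reserved."""
    if not 0 <= max_new_tokens <= context_window:
        raise ValueError("max_new_tokens must be between 0 and the context window")
    return context_window - max_new_tokens


def fits_in_context(prompt_tokens, max_new_tokens, context_window=CONTEXT_WINDOW):
    """Check whether prompt plus reply stay within the context window."""
    return prompt_tokens + max_new_tokens <= context_window


# Reserving 1,024 tokens for the reply leaves 39,936 tokens for the prompt.
print(max_prompt_tokens(1024))
print(fits_in_context(prompt_tokens=39_937, max_new_tokens=1024))
```

The prompt token count itself would come from the model's tokenizer, for example the length of the `input_ids` produced by `apply_chat_template`.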
🔧 LoRA fine-tuning for creativity
Lightweight fine-tuning delivers powerful enhancements without compromising the model's base performance. Tailored for story generation, dialogue, and introspective monologues.
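To give a sense of why LoRA counts as lightweight: it leaves a weight matrix frozen and trains two small low-rank factors alongside it. The sketch below compares parameter counts for a single square projection layer. The layer width of 4096 and the rank of 16 are illustrative assumptions; this model card does not state the rank or target layers actually used.

```python
def full_params(d_in, d_out):
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Parameters in the two LoRA factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical values for illustration only.
width = 4096
rank = 16

dense = full_params(width, width)
adapter = lora_params(width, width, rank)

print(f"dense layer:  {dense:,} parameters")
print(f"LoRA adapter: {adapter:,} parameters")
print(f"trainable fraction: {adapter / dense:.2%}")
```

Under these assumed values, the adapter holds well under 1% of the layer's parameters, which is why the base model's weights can stay untouched during fine-tuning.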
✍️ Ideal Use Cases
- Fiction and novel generation
- Interactive storytelling or RPG dialogue
- Literary assistants and writing aides
- Creative research, inspiration, and plot development
- Long-context memory testing and analysis
🧪 Example Output
Example Output – "Cigno 8B"
(Gutenberg-Fine-Tuned | Qwen 3 | 40k Context Window)
The rain had stopped, but the clouds had gathered over the horizon like a silent army preparing to unleash a second wave of sorrow...
(continue with your narrative text, which is already perfect)
📦 Uploaded Model Details
- Developed by: ClaudioItaly
- License: apache-2.0
- Finetuned from model: unsloth/qwen3-8b-unsloth-bnb-4bit
This Qwen 3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

