Instructions to use Andyrasika/lora_gemma with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Andyrasika/lora_gemma with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("Andyrasika/lora_gemma", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- Unsloth Studio new
How to use Andyrasika/lora_gemma with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Andyrasika/lora_gemma to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Andyrasika/lora_gemma to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for Andyrasika/lora_gemma to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="Andyrasika/lora_gemma", max_seq_length=2048, )
Uploaded model
- Developed by: Andyrasika
- License: apache-2.0
- Finetuned from model : unsloth/gemma-7b-bnb-4bit
This gemma model was trained 2x faster with Unsloth and Huggingface's TRL library.
if False:
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
model_name = "Andyrasika/lora_gemma",
max_seq_length = max_seq_length,
dtype = dtype,
load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""
inputs = tokenizer(
[
alpaca_prompt.format(
"What is a famous tall tower in Paris?", # instruction
"", # input
"", # output - leave this blank for generation!
)
], return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
tokenizer.batch_decode(outputs)
Output
['<bos>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the
request.\n\n### Instruction:\nWhat is a famous tall tower in Paris?\n\n### Input:\n\n\n### Response:\nOne of the most famous tall towers in Paris is the Eiffel Tower.
It is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It is named after the engineer Gustave Eiffel, whose company designed and built the tower.
The tower is 324 meters (1,063 feet']
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
Model tree for Andyrasika/lora_gemma
Base model
unsloth/gemma-7b-bnb-4bit