Instructions to use numind/NuExtract-2.0-2B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use numind/NuExtract-2.0-2B with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="numind/NuExtract-2.0-2B") messages = [ { "role": "user", "content": [ {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"}, {"type": "text", "text": "What animal is on the candy?"} ] }, ] pipe(text=messages)# Load model directly from transformers import AutoProcessor, AutoModelForImageTextToText processor = AutoProcessor.from_pretrained("numind/NuExtract-2.0-2B") model = AutoModelForImageTextToText.from_pretrained("numind/NuExtract-2.0-2B") messages = [ { "role": "user", "content": [ {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"}, {"type": "text", "text": "What animal is on the candy?"} ] }, ] inputs = processor.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use numind/NuExtract-2.0-2B with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "numind/NuExtract-2.0-2B" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "numind/NuExtract-2.0-2B", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }'Use Docker
docker model run hf.co/numind/NuExtract-2.0-2B
- SGLang
How to use numind/NuExtract-2.0-2B with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "numind/NuExtract-2.0-2B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "numind/NuExtract-2.0-2B", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "numind/NuExtract-2.0-2B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "numind/NuExtract-2.0-2B", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }' - Docker Model Runner
How to use numind/NuExtract-2.0-2B with Docker Model Runner:
docker model run hf.co/numind/NuExtract-2.0-2B
Template generation performance about GPTQ model
#4
by troublecoder - opened
For the GPTQ model, the template generation function seems to work normally less frequently than for the base model.
Can you tell me how to fix this?
I am extracting the template by injecting an image with the prompt below.
[
{
"role": "system",
"content": [
{
"type": "text",
"text": "You are NuExtract, an information extraction tool created by NuMind.",
}
],
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "Please create a template to extract key-value pairs from the following image. <|vision_start|><|image_pad|><|vision_end|>",
},
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{encode_image("XXX.jpg")}"},
},
],
},
]
troublecoder changed discussion status to closed
troublecoder changed discussion status to open
Which base model and which GPTQ model are you referring to?
Oh, sorry I'm writing on the 2B model's community.
I'm doing a comparison experiment between 8B and 8B-GPTQ.