See It from My Perspective: Diagnosing the Western Cultural Bias of Large Vision-Language Models in Image Understanding
Paper: arXiv:2406.11665
How to use amitha/mllava-llama2-en with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("visual-question-answering", model="amitha/mllava-llama2-en", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForVisualQuestionAnswering
model = AutoModelForVisualQuestionAnswering.from_pretrained("amitha/mllava-llama2-en", trust_remote_code=True, dtype="auto")

The English Llama2-7B-Chat VLM, trained via LoRA for https://arxiv.org/abs/2406.11665.
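Once a pipeline has been loaded as in the snippet above, a single query could be wrapped roughly as follows. This is a minimal sketch, not part of the model card: the helper name `ask` is hypothetical, and it assumes the object behaves like a standard Transformers visual-question-answering pipeline, which is called with `image=` and `question=` and returns a list of dicts carrying an `"answer"` key. Whether this model's remote code follows that convention exactly has not been verified here.

```python
from typing import Any, Callable, Optional


def ask(pipe: Callable[..., Any], image: str, question: str) -> Optional[str]:
    """Send one image/question pair to a VQA-style pipeline (hypothetical helper).

    Assumption: `pipe` accepts `image=` (a local path or URL string) and
    `question=`, and returns a list of dicts that each contain an "answer" key.
    """
    results = pipe(image=image, question=question)
    if not results:
        # Nothing came back, so there is no answer to report.
        return None
    # Return the answer text of the first candidate in the returned list.
    return results[0].get("answer")


# Example call (assumes `pipe` was created as shown earlier and that
# "photo.jpg" is a placeholder path to an image on disk):
#   print(ask(pipe, "photo.jpg", "What is shown in this image?"))
```

Keeping the pipeline as a plain argument means the helper can be exercised with a stub before downloading the model weights.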