Prism with Qwen 2.5 0.5B backbone (Prismatic-Compatible Version)

This model is trained on the LLaVA-1.5-Instruct dataset. The key difference from this checkpoint is that the official MiniVLA model uses a single vision encoder (SigLIP).

Usage Instructions

See the MiniVLA GitHub README for instructions on using this checkpoint for downstream training and fine-tuning. A minimal download sketch is shown below.
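
As a starting point, the checkpoint files can be pulled locally with the Hugging Face Hub client before following the MiniVLA README. This is a minimal sketch, not the official workflow; the repository id shown is a placeholder, not a value confirmed by this card.

    from huggingface_hub import snapshot_download

    # Download all files in the model repository into the local Hugging Face cache.
    # NOTE: the repo_id below is a placeholder; substitute this model's actual repository id.
    local_dir = snapshot_download(repo_id="ORG/prism-qwen25-0_5b-checkpoint")
    print(f"Checkpoint files available at: {local_dir}")

After downloading, point the MiniVLA training or fine-tuning scripts at the local checkpoint directory as described in the README.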

Reference

BibTeX:

@article{belkhale24minivla,
    title={MiniVLA: A Better VLA with a Smaller Footprint},
    author={Suneel Belkhale and Dorsa Sadigh},
    url={https://github.com/Stanford-ILIAD/openvla-mini},
    year={2024}
}