How to use Omartificial-Intelligence-Space/Qwen3-VL-Embedding-2B-Arabic-VDR with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Omartificial-Intelligence-Space/Qwen3-VL-Embedding-2B-Arabic-VDR")
sentences = [
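# English: "Based on what appears in the image, how can this bird's adaptation to its rocky, desert environment be explained?"
"بناءً على ما يظهر في الصورة، كيف يمكن تفسير تكيف هذا الطائر مع بيئته الصخرية والصحراوية؟",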
"Fauna",
"Flora",
"Oman"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model trained on the pearl-vdr-ar-train-preprocessed dataset. It maps sentences & paragraphs to a 2048-dimensional dense vector space and can be used for retrieval.
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}, 'image': {'method': 'forward', 'method_output_name': 'last_hidden_state'}, 'video': {'method': 'forward', 'method_output_name': 'last_hidden_state'}, 'message': {'method': 'forward', 'method_output_name': 'last_hidden_state', 'format': 'structured'}}, 'module_output_name': 'token_embeddings', 'processing_kwargs': {'chat_template': {'add_generation_prompt': True}}, 'unpad_inputs': False, 'architecture': 'Qwen3VLModel'})
(1): Pooling({'embedding_dimension': 2048, 'pooling_mode': 'lasttoken', 'include_prompt': True})
(2): Normalize({})
)
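The last two modules in the printout, `Pooling` with `pooling_mode: 'lasttoken'` and `Normalize`, can be illustrated with a small sketch. This is not the library's implementation: the helper names `last_token_pool` and `l2_normalize` are hypothetical, and the toy input uses 3 dimensions instead of 2048 so the arithmetic is easy to follow.

```python
import math

def last_token_pool(token_embeddings, attention_mask):
    """Return the embedding of the last non-padded token of one sequence."""
    # Last position whose mask value is 1 (works for left or right padding).
    last_index = max(i for i, m in enumerate(attention_mask) if m == 1)
    return token_embeddings[last_index]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input: 4 token positions, 3 dimensions, final position is padding.
token_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [3.0, 0.0, 4.0],
    [9.0, 9.0, 9.0],  # padding position, ignored via the mask
]
attention_mask = [1, 1, 1, 0]

pooled = last_token_pool(token_embeddings, attention_mask)
embedding = l2_normalize(pooled)
print(embedding)
```

Because the output has unit length, the cosine similarity between two such embeddings reduces to a plain dot product.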
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Omartificial-Intelligence-Space/Qwen3-VL-Embedding-2B-Arabic-VDR")
# Run inference
queries = [
# English: "What is the name of these small white flowers that grow among the rocks?"
'ما اسم هذه الزهور البيضاء الصغيرة التي تنمو بين الصخور؟',
]
documents = [
'https://i.ibb.co/svZf6D92/image1.jpg',
'https://i.ibb.co/spFmq82S/image2.jpg',
'https://i.ibb.co/mF5BDDsB/image3.jpg'
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 2048] [3, 2048]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.5869, -0.1090, 0.1076]])
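For retrieval, the similarity row is usually turned into a ranking. The sketch below does this in plain Python, reusing the three image URLs and the scores printed above as fixed values so it runs without downloading the model. The helper `rank_documents` is a hypothetical name, not part of Sentence Transformers.

```python
# Scores copied from the printed similarity tensor (one query, three documents).
documents = [
    "https://i.ibb.co/svZf6D92/image1.jpg",
    "https://i.ibb.co/spFmq82S/image2.jpg",
    "https://i.ibb.co/mF5BDDsB/image3.jpg",
]
scores = [0.5869, -0.1090, 0.1076]

def rank_documents(documents, scores, top_k=None):
    """Return (document, score) pairs sorted from most to least similar."""
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]

ranking = rank_documents(documents, scores)
for rank, (doc, score) in enumerate(ranking, start=1):
    print(f"{rank}. {doc} (score={score:.4f})")
```

With a live model, `scores` would instead come from `similarities[0].tolist()`.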
Columns: query, image, and negative_0

| | query | image | negative_0 |
|---|---|---|---|
| type | string | image | image |
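| query | image | negative_0 |
|---|---|---|
| ما هي التحديات التي تواجه الحرف التقليدية كما يظهر في الصورة، وما هي الحلول الممكنة لمواجهة هذه التحديات؟ (English: "What challenges face traditional crafts, as shown in the image, and what are possible solutions to these challenges?") | (image) | (image) |
| إذا شاركت في ورشة عمل لتعلم كيفية صنع الآلة التي يظهر في الصورة، ما هي الخطوات التي ستحتاج إلى اتباعها لصنعها بشكل صحيح؟ (English: "If you took part in a workshop to learn how to make the instrument shown in the image, what steps would you need to follow to make it correctly?") | (image) | (image) |
| كيف يختلف العزف على الآلة التي يظهر في الصورة عن العزف على الآلات الوترية الأخرى في المنطقة، وما هي الخصائص الفريدة لهذه الآلة؟ (English: "How does playing the instrument shown in the image differ from playing other string instruments in the region, and what are this instrument's unique characteristics?") | (image) | (image) |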
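Each row is a (query, image, negative_0) triplet: the query should score higher against `image` than against `negative_0`. A ranking objective of the kind named in the loss configuration can be sketched as a cross-entropy over scaled similarities. This is only an illustration under assumptions: the similarity values are made up, `scale=20.0` is an assumed temperature, and real training also draws extra negatives from the other rows in a batch.

```python
import math

def ranking_loss(sim_positive, sim_negatives, scale=20.0):
    """Cross-entropy over scaled similarities, with the positive as the target."""
    logits = [scale * sim_positive] + [scale * s for s in sim_negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[0]

# Hypothetical cosine similarities for one (query, image, negative_0) row.
easy = ranking_loss(sim_positive=0.8, sim_negatives=[0.1])   # negative is far away
hard = ranking_loss(sim_positive=0.5, sim_negatives=[0.45])  # negative is close
print(easy, hard)
```

The loss is near zero when the positive clearly outscores the negative, and grows as the hard negative's similarity approaches the positive's.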
MatryoshkaLoss with these parameters:
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
2048,
1536,
1024,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
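Training at several `matryoshka_dims` is meant to keep the leading slice of each embedding useful on its own, so a vector can be shortened to one of the listed sizes and re-normalized. The sketch below shows that step in plain Python on a synthetic 2048-value vector. The helper `truncate_and_normalize` is a hypothetical name. Sentence Transformers also offers a `truncate_dim` option when loading a model, which is worth checking against the installed version's documentation.

```python
import math

# Dimensions listed in the loss parameters.
MATRYOSHKA_DIMS = [2048, 1536, 1024, 512, 256, 128, 64]

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Synthetic stand-in for a model output (not a real embedding).
full_embedding = [math.sin(i + 1) for i in range(2048)]

small = truncate_and_normalize(full_embedding, 256)
print(len(small), round(sum(x * x for x in small), 6))
```

Smaller dimensions trade some retrieval quality for lower storage and faster search. The size of that trade-off is not reported here and would need to be measured.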
- per_device_train_batch_size: 64
- num_train_epochs: 2
- learning_rate: 1e-05
- warmup_steps: 0.03
- bf16: True
- per_device_eval_batch_size: 64
- batch_sampler: no_duplicates

@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
If you use this dataset or the accompanying benchmarks, please cite our paper:
@inproceedings{alwajih-etal-2025-pearl,
title = "Pearl: A Multimodal Culturally-Aware {A}rabic Instruction Dataset",
author = "Alwajih, Fakhraddin and
Magdy, Samar M. and
El Mekki, Abdellah and
Nacar, Omer and
Nafea, Youssef and
Abdelfadil, Safaa Taher and
Yahya, Abdulfattah Mohammed and
Luqman, Hamzah and
Almarwani, Nada and
Aloufi, Samah and
Qawasmeh, Baraah and
Atou, Houdaifa and
Sibaee, Serry and
Alsayadi, Hamzah A. and
Al-Dhabyani, Walid and
Al-shaibani, Maged S. and
El aatar, Aya and
Qandos, Nour and
Alhamouri, Rahaf and
Ahmad, Samar and
AL-Ghrawi, Mohammed Anwar and
Yacoub, Aminetou and
AbuHweidi, Ruwa and
Lemin, Vatimetou Mohamed and
Abdel-Salam, Reem and
Bashiti, Ahlam and
Ammar, Adel and
Alansari, Aisha and
Ashraf, Ahmed and
Alturayeif, Nora and
Alcoba Inciarte, Alcides and
Elmadany, AbdelRahim A. and
Tourad, Mohamedou Cheikh and
Berrada, Ismail and
Jarrar, Mustafa and
Shehata, Shady and
Abdul-Mageed, Muhammad",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-emnlp.1254/",
pages = "23048--23079",
ISBN = "979-8-89176-335-7"
}