DINOV3
Vision Transformer (ViT) and ConvNeXt models trained using the DINOv3 method.
Reference
DINOv3 is a powerful, generalist visual backbone learned entirely from unlabeled images, as described in the paper "DINOv3: Learning Robust Visual Features without Supervision".
Keras and KerasHub can be installed with:
pip install -U -q keras-hub
pip install -U -q keras
JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
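Because Keras can run on any of these three backends, it can help to choose one explicitly. The sketch below sets the `KERAS_BACKEND` environment variable, which Keras reads on first import. The helper name `select_backend` and the `SUPPORTED_BACKENDS` tuple are illustrative assumptions, not part of Keras; the tuple lists only the backends named above.

```python
import os

# Backends named on this page; the tuple itself is an illustrative assumption.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_backend(name: str) -> str:
    """Set KERAS_BACKEND. This must run before the first `import keras`."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("jax")

# Import Keras only after the backend has been chosen, for example:
# import keras
# import keras_hub
```

Changing the variable after Keras has already been imported has no effect in the running process, so in a notebook this cell should run first.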
The following model checkpoints are provided by the Keras team. The weights were ported from Hugging Face (https://huggingface.co). Full code examples for each are available below.
| Preset name | Parameters | Description |
|---|---|---|
| dinov3_vit_small_lvd1689m | 21.6M | Vision Transformer (small-sized model) trained on LVD-1689M using DINOv3. |
| dinov3_vit_small_plus_lvd1689m | 29M | Vision Transformer (small-plus-sized model) trained on LVD-1689M using DINOv3. |
| dinov3_vit_base_lvd1689m | 86M | Vision Transformer (base-sized model) trained on LVD-1689M using DINOv3. |
| dinov3_vit_large_lvd1689m | 300M | Vision Transformer (large-sized model) trained on LVD-1689M using DINOv3. |
| dinov3_vit_huge_plus_lvd1689m | 840M | Vision Transformer (huge-plus-sized model) trained on LVD-1689M using DINOv3. |
| dinov3_vit_7b_lvd1689m | 6.7B | Vision Transformer (7B-sized model) trained on LVD-1689M using DINOv3. |
| dinov3_vit_large_sat493m | 300M | Vision Transformer (large-sized model) trained on SAT-493M using DINOv3. |
| dinov3_vit_7b_sat493m | 6.7B | Vision Transformer (7B-sized model) trained on SAT-493M using DINOv3. |
All model weights are released under the DINOv3 license: https://ai.meta.com/resources/models-and-libraries/dinov3-license/
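The presets in the table span more than two orders of magnitude in size, so a rough memory estimate is useful before downloading one. The sketch below copies the approximate parameter counts from the table and picks the largest preset whose weights fit a given budget. It assumes 4 bytes per parameter (float32) and counts weights only, not activations. The helper names (`weight_memory_gb`, `largest_preset_within`, `load_backbone`) are hypothetical. Loading goes through the generic `keras_hub.models.Backbone.from_preset` entry point, on the assumption that the installed `keras-hub` version ships these presets.

```python
# Approximate parameter counts, copied from the preset table.
PRESET_PARAMS = {
    "dinov3_vit_small_lvd1689m": 21.6e6,
    "dinov3_vit_small_plus_lvd1689m": 29e6,
    "dinov3_vit_base_lvd1689m": 86e6,
    "dinov3_vit_large_lvd1689m": 300e6,
    "dinov3_vit_huge_plus_lvd1689m": 840e6,
    "dinov3_vit_7b_lvd1689m": 6.7e9,
    "dinov3_vit_large_sat493m": 300e6,
    "dinov3_vit_7b_sat493m": 6.7e9,
}


def weight_memory_gb(preset: str, bytes_per_param: int = 4) -> float:
    """Rough size of the weights alone, in gigabytes (float32 by default)."""
    return PRESET_PARAMS[preset] * bytes_per_param / 1e9


def largest_preset_within(
    budget_gb: float, dataset: str = "lvd1689m", bytes_per_param: int = 4
) -> str:
    """Return the largest preset for `dataset` whose weights fit the budget."""
    candidates = [
        name
        for name in PRESET_PARAMS
        if name.endswith(dataset)
        and weight_memory_gb(name, bytes_per_param) <= budget_gb
    ]
    if not candidates:
        raise ValueError(f"No {dataset} preset fits within {budget_gb} GB")
    return max(candidates, key=PRESET_PARAMS.get)


def load_backbone(preset: str):
    """Download and build the backbone; needs keras-hub and network access."""
    import keras_hub  # deferred so the helpers above work without Keras

    return keras_hub.models.Backbone.from_preset(preset)


if __name__ == "__main__":
    choice = largest_preset_within(2.0)
    print(choice, f"~{weight_memory_gb(choice):.2f} GB of float32 weights")
    # backbone = load_backbone(choice)  # uncomment to download the weights
```

With a 2 GB budget this selects `dinov3_vit_large_lvd1689m` (about 1.2 GB of float32 weights). Actual memory use during inference will be higher and depends on batch size and image resolution.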