Depth Anything 3 โ GGUF weights for depth-anything.cpp
Brought to you by the LocalAI team.
GGUF conversions of ByteDance Depth Anything 3, for use with depth-anything.cpp โ a from-scratch C++17 / ggml port. No Python, no PyTorch, no CUDA toolkit at inference: one self-contained GGUF file plus a small native library and CLI, faster than PyTorch on CPU and bit-exact against the original (correlation 1.0, verified component by component).
Given an image, the engine recovers a dense depth map, per-pixel confidence, camera extrinsics (3ร4) and intrinsics (3ร3), an optional sky mask, a back-projected 3D point cloud, and exports to glb / COLMAP / PLY.
Files in this repo
Each GGUF is fully self-contained โ every dimension, hyperparameter and preprocessing constant is baked into the file; the loader reads them, nothing is hardcoded.
| File | Source checkpoint | Backbone | Depth type | Output |
|---|---|---|---|---|
depth-anything-small-f32.gguf |
DA3-SMALL |
ViT-S | relative | depth + conf + pose |
depth-anything-base-f32.gguf |
DA3-BASE |
ViT-B | relative | depth + conf + pose |
depth-anything-base-f16.gguf |
DA3-BASE |
ViT-B | relative | depth + conf + pose |
depth-anything-base-q8_0.gguf |
DA3-BASE |
ViT-B | relative | depth + conf + pose (near-lossless) |
depth-anything-base-q4_k.gguf |
DA3-BASE |
ViT-B | relative | depth + conf + pose (99 MB) |
depth-anything-large-f32.gguf |
DA3-LARGE |
ViT-L | relative | depth + conf + pose |
depth-anything-giant-f32.gguf |
DA3-GIANT |
ViT-g | relative | depth + conf + pose + 3D Gaussians |
depth-anything-mono-large-f32.gguf |
DA3MONO-LARGE |
ViT-L | relative (monocular) | depth + sky |
depth-anything-metric-large-f32.gguf |
DA3METRIC-LARGE |
ViT-L | metric | metric depth + sky |
depth-anything-nested-anyview.gguf |
DA3NESTED-GIANT-LARGE (anyview branch) |
ViT-g | relative | depth + conf + pose |
depth-anything-nested-metric.gguf |
DA3NESTED-GIANT-LARGE (metric branch) |
ViT-L | metric | depth + sky |
The nested model is a two-file pair: the engine loads the anyview (ViT-g) branch and the metric (ViT-L) branch together and aligns them to produce metric-scale depth + pose. Download both
depth-anything-nested-anyview.ggufanddepth-anything-nested-metric.gguf.
Which one should I use?
- Just trying it out / CPU:
depth-anything-base-q4_k.gguf(99 MB, near-lossless). - Best quality/speed default:
depth-anything-base-q8_0.gguf. - Smallest / fastest:
depth-anything-small-f32.gguf. - Highest quality + 3D reconstruction (point cloud / Gaussians):
depth-anything-giant-f32.gguf. - Single-image depth with sky mask:
depth-anything-mono-large-f32.gguf. - Metric-scale depth (meters), single model:
depth-anything-metric-large-f32.gguf. - Best metric-scale depth + pose: the nested pair (
depth-anything-nested-anyview.gguf+depth-anything-nested-metric.gguf).
Usage
depth-anything.cpp (CLI)
git clone https://github.com/mudler/depth-anything.cpp && cd depth-anything.cpp
cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j
# download a weight from this repo
hf download mudler/depth-anything.cpp-gguf depth-anything-base-q4_k.gguf --local-dir models
./build/da3 depth models/depth-anything-base-q4_k.gguf image.jpg --out depth.png
./build/da3 depth models/depth-anything-base-q4_k.gguf image.jpg --pose poses.json
./build/da3 reconstruct models/depth-anything-giant-f32.gguf image.jpg --ply cloud.ply
# metric-scale depth from the single metric model
./build/da3 depth models/depth-anything-metric-large-f32.gguf image.jpg --out depth.png
# metric-scale depth + pose from the nested pair (anyview + metric branches)
./build/da3 depth models/depth-anything-nested-anyview.gguf image.jpg \
--metric-model models/depth-anything-nested-metric.gguf --pfm depth.pfm
See the README for multi-view, glb/COLMAP export, quantization and the flat C API.
LocalAI
local-ai run depth-anything-3-base
Performance
Faster than PyTorch on CPU at half the memory, bit-exact. AMD Ryzen 9 9950X3D, threads=16,
504ร336, sustained:
| engine | quant | model MB | load ms | infer ms | peak RAM MB | vs PyTorch |
|---|---|---|---|---|---|---|
| PyTorch | f32 | 516 | 749 | 416.9 | 1328 | 1.00ร |
| C++/ggml | f32 | 393 | 112 | 346.4 | 614 | 1.20ร |
| C++/ggml | q8_0 | 142 | 40 | 319.4 | 363 | 1.31ร |
| C++/ggml | q4_k | 99 | 25 | 395.2 | 320 | 1.05ร |
Full methodology in benchmarks/BENCHMARK.md.
License
The GGUF weights are derived from the official Depth Anything 3 checkpoints and inherit their Apache-2.0 license. The depth-anything.cpp code is MIT.
Citation
@article{depthanything3,
title = {Depth Anything 3: Recovering the Visual Space from Any Views},
author = {ByteDance Seed},
year = {2025}
}
- Downloads last month
- -
Model tree for mudler/depth-anything.cpp-gguf
Base model
depth-anything/DA3-BASE