Depth Anything 3: GGUF weights for depth-anything.cpp

Brought to you by the LocalAI team.

GGUF conversions of ByteDance Depth Anything 3, for use with depth-anything.cpp, a from-scratch C++17 / ggml port. Inference needs no Python, no PyTorch and no CUDA toolkit: one self-contained GGUF file plus a small native library and CLI. The port is faster than PyTorch on CPU and bit-exact against the original (correlation 1.0, verified component by component).

Given an image, the engine recovers a dense depth map, per-pixel confidence, camera extrinsics (3×4) and intrinsics (3×3), an optional sky mask, and a back-projected 3D point cloud, and it can export to glb / COLMAP / PLY.
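How a depth value and the intrinsics combine into a 3D point can be illustrated with the standard pinhole model. This is a minimal sketch, not the engine's code. It assumes the intrinsics are laid out as K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] and that depth is measured along the optical axis. The function name `backproject_pixel` and the numeric values are hypothetical.

```python
def backproject_pixel(u, v, depth, K):
    """Map pixel (u, v) with z-depth `depth` to a 3D point in camera coordinates.

    K is a 3x3 intrinsics matrix given as nested lists.
    Assumption: pinhole layout [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    """
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Illustrative values only: principal point at the centre of a 504x336 image
# and a made-up focal length of 400 pixels.
K = [[400.0, 0.0, 252.0],
     [0.0, 400.0, 168.0],
     [0.0, 0.0, 1.0]]
point = backproject_pixel(352.0, 168.0, 2.0, K)
```

Applying the same function to every pixel of a depth map yields a point cloud in the camera frame. Moving it into a shared world frame additionally requires the 3×4 extrinsics, whose convention is not covered here.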

Files in this repo

Each GGUF is fully self-contained: every dimension, hyperparameter and preprocessing constant is baked into the file; the loader reads them, nothing is hardcoded.
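To check that a downloaded file is a GGUF container before handing it to the engine, the fixed-size header can be read directly. The sketch below assumes GGUF version 2 or later (little-endian: 4-byte magic, 32-bit version, 64-bit tensor count, 64-bit metadata key/value count). It does not parse the metadata itself, and the key names depth-anything.cpp stores are not shown because they are not listed here. The function name `read_gguf_header` is hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_FORMAT = "<4sIQQ"  # magic, version, tensor_count, metadata_kv_count
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumption: GGUF v2+ layout with 64-bit counts, little-endian.
    """
    with open(path, "rb") as f:
        raw = f.read(HEADER_SIZE)
    if len(raw) < HEADER_SIZE:
        raise ValueError("file is too short to contain a GGUF header")
    magic, version, tensor_count, kv_count = struct.unpack(HEADER_FORMAT, raw)
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic)")
    return version, tensor_count, kv_count
```

For example, `read_gguf_header("models/depth-anything-base-q4_k.gguf")` would return the container version together with the number of tensors and metadata entries in that file.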

| File | Source checkpoint | Backbone | Depth type | Output |
|---|---|---|---|---|
| depth-anything-small-f32.gguf | DA3-SMALL | ViT-S | relative | depth + conf + pose |
| depth-anything-base-f32.gguf | DA3-BASE | ViT-B | relative | depth + conf + pose |
| depth-anything-base-f16.gguf | DA3-BASE | ViT-B | relative | depth + conf + pose |
| depth-anything-base-q8_0.gguf | DA3-BASE | ViT-B | relative | depth + conf + pose (near-lossless) |
| depth-anything-base-q4_k.gguf | DA3-BASE | ViT-B | relative | depth + conf + pose (99 MB) |
| depth-anything-large-f32.gguf | DA3-LARGE | ViT-L | relative | depth + conf + pose |
| depth-anything-giant-f32.gguf | DA3-GIANT | ViT-g | relative | depth + conf + pose + 3D Gaussians |
| depth-anything-mono-large-f32.gguf | DA3MONO-LARGE | ViT-L | relative (monocular) | depth + sky |
| depth-anything-metric-large-f32.gguf | DA3METRIC-LARGE | ViT-L | metric | metric depth + sky |
| depth-anything-nested-anyview.gguf | DA3NESTED-GIANT-LARGE (anyview branch) | ViT-g | relative | depth + conf + pose |
| depth-anything-nested-metric.gguf | DA3NESTED-GIANT-LARGE (metric branch) | ViT-L | metric | depth + sky |

The nested model is a two-file pair: the engine loads the anyview (ViT-g) branch and the metric (ViT-L) branch together and aligns them to produce metric-scale depth + pose. Download both depth-anything-nested-anyview.gguf and depth-anything-nested-metric.gguf.
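How the engine aligns the two branches is not described here. As a rough intuition only, one common way to bring a relative depth map onto a metric scale is to fit a single least-squares scale factor between paired samples. The sketch below shows that generic technique and should not be read as what depth-anything.cpp does. The names `fit_scale` and `apply_scale` are hypothetical.

```python
def fit_scale(relative, metric):
    """Return the s that minimizes sum((s * r - m)^2) over paired samples.

    Closed form: s = sum(r * m) / sum(r * r).
    Assumption: a single global scale is enough (no shift term).
    """
    num = 0.0
    den = 0.0
    for r, m in zip(relative, metric):
        num += r * m
        den += r * r
    if den == 0.0:
        raise ValueError("relative depths are all zero; scale is undefined")
    return num / den


def apply_scale(relative, s):
    """Rescale relative depths by the fitted factor."""
    return [s * r for r in relative]


# Toy data: the metric samples are exactly twice the relative ones.
s = fit_scale([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
aligned = apply_scale([1.0, 2.0, 3.0], s)
```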

Which one should I use?

  • Just trying it out / CPU: depth-anything-base-q4_k.gguf (99 MB, near-lossless).
  • Best quality/speed default: depth-anything-base-q8_0.gguf.
  • Smallest / fastest: depth-anything-small-f32.gguf.
  • Highest quality + 3D reconstruction (point cloud / Gaussians): depth-anything-giant-f32.gguf.
  • Single-image depth with sky mask: depth-anything-mono-large-f32.gguf.
  • Metric-scale depth (meters), single model: depth-anything-metric-large-f32.gguf.
  • Best metric-scale depth + pose: the nested pair (depth-anything-nested-anyview.gguf + depth-anything-nested-metric.gguf).

Usage

depth-anything.cpp (CLI)

git clone https://github.com/mudler/depth-anything.cpp && cd depth-anything.cpp
cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j

# download a weight from this repo
hf download mudler/depth-anything.cpp-gguf depth-anything-base-q4_k.gguf --local-dir models

./build/da3 depth models/depth-anything-base-q4_k.gguf image.jpg --out depth.png
./build/da3 depth models/depth-anything-base-q4_k.gguf image.jpg --pose poses.json
./build/da3 reconstruct models/depth-anything-giant-f32.gguf image.jpg --ply cloud.ply

# metric-scale depth from the single metric model
./build/da3 depth models/depth-anything-metric-large-f32.gguf image.jpg --out depth.png

# metric-scale depth + pose from the nested pair (anyview + metric branches)
./build/da3 depth models/depth-anything-nested-anyview.gguf image.jpg \
    --metric-model models/depth-anything-nested-metric.gguf --pfm depth.pfm

See the README for multi-view, glb/COLMAP export, quantization and the flat C API.
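The `--pfm` output from the nested example can be post-processed outside the engine. The reader below is a minimal sketch that assumes the file follows the conventional PFM layout: a `Pf` (one channel) or `PF` (three channels) magic line, a `width height` line, a scale line whose sign encodes endianness, then raw 32-bit floats stored bottom row first. Whether depth-anything.cpp writes exactly this layout should be checked against its README. The function name `read_pfm` is hypothetical.

```python
import struct


def read_pfm(path):
    """Read a PFM file and return (rows, scale), rows ordered top to bottom.

    Assumption: conventional PFM layout with no comment lines in the header.
    """
    with open(path, "rb") as f:
        magic = f.readline().strip()
        if magic == b"Pf":
            channels = 1
        elif magic == b"PF":
            channels = 3
        else:
            raise ValueError("not a PFM file")
        width, height = (int(tok) for tok in f.readline().decode("ascii").split())
        scale = float(f.readline().decode("ascii").strip())
        endian = "<" if scale < 0 else ">"  # negative scale means little-endian
        count = width * height * channels
        data = f.read(4 * count)
    if len(data) != 4 * count:
        raise ValueError("PFM payload is truncated")
    values = struct.unpack(f"{endian}{count}f", data)
    row_len = width * channels
    rows = [list(values[r * row_len:(r + 1) * row_len]) for r in range(height)]
    rows.reverse()  # the file stores the bottom row first
    return rows, abs(scale)
```

With the nested command above, `read_pfm("depth.pfm")` would give one list of float values per image row.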

LocalAI

local-ai run depth-anything-3-base

Performance

Faster than PyTorch on CPU at half the memory, bit-exact. AMD Ryzen 9 9950X3D, threads=16, 504×336, sustained:

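| engine | quant | model MB | load ms | infer ms | peak RAM MB | vs PyTorch |
|---|---|---|---|---|---|---|
| PyTorch | f32 | 516 | 749 | 416.9 | 1328 | 1.00× |
| C++/ggml | f32 | 393 | 112 | 346.4 | 614 | 1.20× |
| C++/ggml | q8_0 | 142 | 40 | 319.4 | 363 | 1.31× |
| C++/ggml | q4_k | 99 | 25 | 395.2 | 320 | 1.05× |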

Full methodology in benchmarks/BENCHMARK.md.
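The "vs PyTorch" column and the "half the memory" claim follow directly from the inference-time and peak-RAM figures in the table. The short sketch below recomputes them using only the numbers listed above. The variable and function names are illustrative.

```python
# (engine, quant) -> (infer_ms, peak_ram_mb), copied from the benchmark table.
ROWS = {
    ("PyTorch", "f32"): (416.9, 1328),
    ("C++/ggml", "f32"): (346.4, 614),
    ("C++/ggml", "q8_0"): (319.4, 363),
    ("C++/ggml", "q4_k"): (395.2, 320),
}
BASELINE = ("PyTorch", "f32")


def relative_to_baseline(rows, baseline=BASELINE):
    """Return {key: (speedup, ram_ratio)} relative to the baseline row.

    speedup   = baseline infer ms / engine infer ms   (higher is faster)
    ram_ratio = engine peak RAM / baseline peak RAM   (lower is leaner)
    """
    base_ms, base_ram = rows[baseline]
    return {key: (base_ms / ms, ram / base_ram) for key, (ms, ram) in rows.items()}


for (engine, quant), (speedup, ram_ratio) in relative_to_baseline(ROWS).items():
    print(f"{engine:9s} {quant:5s} {speedup:.2f}x speed, {ram_ratio:.2f}x peak RAM")
```

For the f32 build this gives a 1.20× speedup at about 0.46× the peak RAM of PyTorch, which is where "half the memory" comes from.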

License

The GGUF weights are derived from the official Depth Anything 3 checkpoints and inherit their Apache-2.0 license. The depth-anything.cpp code is MIT.

Citation

@article{depthanything3,
  title   = {Depth Anything 3: Recovering the Visual Space from Any Views},
  author  = {ByteDance Seed},
  year    = {2025}
}