Models: 98

Active filters: speculative-decoding

festr2/GLM-5-NVFP4-MTP

435B • Updated 6 days ago • 2.84k

Cloudriver/MSD-Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • Updated 6 days ago • 96

Cloudriver/EAGLE3-Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • Updated 5 days ago • 69

dealignai/Qwen3.5-VL-2B-8bit-MLX-CRACK

Image-Text-to-Text • 0.9B • Updated about 5 hours ago • 324

ping69852/Medusa-LLaVA1.5-7B

Image-Text-to-Text • Updated 3 days ago • 26

ping69852/Medusa-Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • Updated 3 days ago • 21

lightseekorg/kimi-k2.5-eagle3

3B • Updated about 16 hours ago • 49

BLR2/Qwen3.5-9B-Eagle3-ShareGPT

Updated about 5 hours ago