Active filters: modelopt
nvidia/Llama-4-Maverick-17B-128E-Instruct-FP8
402B • Updated
• 428
• 12
nvidia/Llama-4-Scout-17B-16E-Instruct-FP8
109B • Updated
• 46k
• 11
ishan24/test_modelopt_quant
nvidia/Llama-4-Maverick-17B-128E-Eagle3
Updated
• 118
• 9
jiangchengchengNLP/L3.3-MS-Nevoria-70b-FP8
Text Generation
• 71B • Updated
NVFP4/Qwen3-30B-A3B-Instruct-2507-FP4
Text Generation
• 16B • Updated
• 75.7k
• 12
gesong2077/Qwen3-32B-NVFP4
19B • Updated
• 1
54B • Updated
• 1
nvidia/Phi-4-multimodal-instruct-NVFP4
4B • Updated
• 3.92k
• 7
nvidia/Phi-4-multimodal-instruct-FP8
6B • Updated
• 9.77k
• 5
nvidia/Phi-4-reasoning-plus-NVFP4
8B • Updated
• 1.25k
• 6
nvidia/Llama-3.1-8B-Instruct-NVFP4
5B • Updated
• 76.3k
• 7
Text Generation
• 5B • Updated
• 32.4k
• 15
Text Generation
• 8B • Updated
• 9.66k
• 4
Text Generation
• 8B • Updated
• 209k
• 5
Text Generation
• 15B • Updated
• 13.6k
• 4
nvidia/Qwen2.5-VL-7B-Instruct-FP8
Text Generation
• 8B • Updated
• 304
• 7
nvidia/Qwen2.5-VL-7B-Instruct-NVFP4
Text Generation
• 5B • Updated
• 28.9k
• 13
nuphoto-ian/Qwen3-8B-QAT-NVFP4
5B • Updated
txn545/Qwen3-Coder-30B-A3B-Instruct-NVFP4
16B • Updated
• 1
shanjiaz/gpt-oss-120b-nvfp4-modelopt
59B • Updated
• 1.74k
• 4
shanjiaz/gpt-oss-20b-nvfp4-modelopt
11B • Updated
• 996
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1-FP4-QAD
Image-Text-to-Text
• Updated
• 423
• 14
baseten-admin/glm-4.6-fp4
177B • Updated
• 7
baseten-admin/glm-4.6-fp8
353B • Updated
• 1
baseten-admin/glm-4.6-fp4-mlp
183B • Updated
• 191
shinedays1993/Qwen3-30B-A3B-nvfp4
16B • Updated
shinedays1993/Qwen3-32B-nvfp4
17B • Updated
Beambutbetter/Deepseek-V2-Lite-16B-NVFP4
Text Generation
• 8B • Updated
• 27
• 3
ramblingpolymath/Qwen3-4B-Instruct-2507
2B • Updated