nm-testing/Meta-Llama-3-8B-Instruct-MXFP4A16-GPTQ
nm-testing/Speculator-Qwen3-30B-MOE-VL-Eagle3
0.4B • Updated • 313
nm-testing/Qwen3-0.6B-FP8_BLOCK
0.6B • Updated • 3
nm-testing/Qwen3-0.6B-W4A16-G128
0.6B • Updated • 2
nm-testing/Llama-3.2-1B-Instruct-DEBUG-STRAWBERRY
nm-testing/Llama-3.2-1B-Instruct-DEBUG-COUNTER
nm-testing/TinyLlama-1.1B-compressed-tensors-kv-cache-scheme
Text Generation • 1B • Updated • 237
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-attn_head
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-tensor
1B • Updated • 6.89k
nm-testing/Meta-Llama-3-8B-Instruct-awq-NVFP4
nm-testing/testing-llama3.1.8b-2layer-eagle3
nm-testing/CDH-test-nvfp4-awq
5B • Updated • 1
nm-testing/granite-4.0-h-small-FP8-dynamic
Text Generation • 32B • Updated • 2
nm-testing/tinysmokeqwen3moe-W4A16-first-only-CTstable
2.93M • Updated • 17k
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Head
Updated
nm-testing/DeepSeek-R1-Distill-Qwen-32B-NVFP4
Text Generation • 19B • Updated • 905 • 3
nm-testing/tinysmokeqwen3moe-W4A16-first-only
2.93M • Updated • 5
nm-testing/tinysmokeqwen3moe
2.93M • Updated • 1.64k
nm-testing/Meta-Llama-3-8B-Instruct-MXFP4
5B • Updated • 2