Transformers Model Architectures • Running • 6 • Browse and filter transformer model architecture diagrams
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 • Text Generation • 561B • Updated 24 days ago • 145k • 258
ML Intern • Running on CPU Upgrade • Featured • 410 • Explore machine learning tasks via an interactive web app
Distilling 100B+ Models 40x Faster with TRL • Running • Featured • 90 • TRL distillation for 100B+ teachers, 40x faster
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • Running on CPU Upgrade • 263 • Visualize synthetic-data experiments as an interactive bookshelf
The Jagged AI Frontier is a Data Frontier • Running • 17 • Why AI capabilities are shaped by data availability