MOSS-Audio Collection An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 7 items • Updated about 3 hours ago • 52
Canary ASR/AST Collection A collection of multilingual and multitask speech to text models from NVIDIA NeMo 🐤 • 6 items • Updated 12 days ago • 34
Parakeet ASR Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 16 items • Updated 12 days ago • 70
view article Article How I contributed a new model to the Transformers library using Codex Mar 30 • 50
view article Article Raw Robot Video to VLA-Ready Training Data: Annotating LeRobot Datasets with Nomadic and HuggingFace Buckets Mar 21 • 17
ALARM Collection Official checkpoints and data for "ALARM: Audio–Language Alignment for Reasoning Models" • 8 items • Updated Mar 9 • 1
Music Flamingo: Scaling Music Understanding in Audio Language Models Paper • 2511.10289 • Published Nov 13, 2025 • 19
view article Article Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines +2 Mar 5 • 51
view article Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 Feb 20 • 505