pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated 8 days ago • 84
ColBERT-Zero 🐶 Collection First large-scale fully pre-trained ColBERT model using only public data, outperforming GTE-ModernColBERT and GTE-ModernBERT • 10 items • Updated 3 days ago • 17
GLiClass-Instruct Collection Multi-task efficient zero-shot sequence classification models • 3 items • Updated 17 days ago • 4
jina-embeddings-v5-text Collection Our 5th-gen embeddings: two lightweight multilingual models with SOTA performance in retrieval, matching, clustering, and classification. • 29 items • Updated 7 days ago • 33
Article Firecracker vs Docker: The Technical Boundary Between MicroVMs and Containers Nov 6, 2025 • 2
Article LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling 21 days ago • 47
NeuTTS Air Collection NeuTTS Air is a speech foundation model that runs on CPU in real-time, with instant voice cloning. • 3 items • Updated 22 days ago • 21
NeuTTS Nano Multilingual Collection Collection NeuTTS Nano is a TTS model, 3x smaller than NeuTTS Air, that runs on CPU in real-time - now in English, Spanish, French, and German versions! • 13 items • Updated 8 days ago • 16
mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9, 2025 • 52
Falcon-H1-Tiny Collection A series of extremely small, yet powerful language models redefining capabilities at small scale • 19 items • Updated 4 days ago • 35
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated 4 days ago • 210
Tarka Embed V1 Collection Efficient DFKD embeddings for language understanding • 5 items • Updated Dec 17, 2025 • 6
GLiNER-PII Collection PII detection models developed in collaboration with Wordcab • 5 items • Updated Jan 29 • 22
MobileLLM-R1 Collection MobileLLM-R1, a series of sub-billion parameter reasoning models • 10 items • Updated Nov 21, 2025 • 27