Kokoro-82M CoreML
End-to-end CoreML export of hexgrad/Kokoro-82M at FP16, optimized for Apple Neural Engine. Requires iOS 18+ / macOS 15+.
A single kokoro_5s.mlmodelc runs the full pipeline (BERT → duration
prediction → fixed-shape alignment → prosody → decoder) in one CoreML
call. G2P (grapheme-to-phoneme) is a separate pair of CoreML models.
Looking for a smaller variant? See aufklarer/Kokoro-82M-CoreML-INT8:
INT8 k-means palettized, 83 MB vs 325 MB here, with a log-spectral distance
of 0.42 against this FP16 reference on a validation utterance.
Model
| Parameter | Value |
|---|---|
| Parameters | 82M |
| Precision | FP16 |
| Max audio length | 5 s (200 frames @ 40 fps) |
| Sample rate | 24 kHz |
| Style dimension | 256 |
| Max phonemes per pass | 128 |
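The limits in the table are related by simple arithmetic, which the sketch below spells out. The constants are copied from the table; `chunked` is a hypothetical helper (not part of speech-swift) that shows one naive way to keep each pass under the 128-phoneme limit. How the actual pipeline handles longer inputs is not stated here, and a real splitter would presumably break at word or sentence boundaries rather than at fixed offsets.

```swift
// Limits copied from the Model table; the derived values are plain arithmetic.
let sampleRate = 24_000          // Hz
let framesPerSecond = 40
let maxFrames = 200
let maxPhonemesPerPass = 128

let maxSeconds = Double(maxFrames) / Double(framesPerSecond)   // 200 / 40 = 5.0 s
let samplesPerFrame = sampleRate / framesPerSecond             // 24_000 / 40 = 600
let maxSamples = maxFrames * samplesPerFrame                   // 200 * 600 = 120_000

/// Hypothetical helper: split a sequence into consecutive chunks of at most `limit` items.
/// This is a naive fixed-size split, not the library's own strategy.
func chunked<T>(_ items: [T], limit: Int) -> [[T]] {
    precondition(limit > 0, "limit must be positive")
    return stride(from: 0, to: items.count, by: limit).map { start in
        Array(items[start..<min(start + limit, items.count)])
    }
}
```

For example, a 300-phoneme input would be split into passes of 128, 128, and 44 phonemes.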
Files
| File | Size | Description |
|---|---|---|
| kokoro_5s.mlmodelc | 325 MB | Pre-compiled end-to-end model (loads directly on-device) |
| G2PEncoder.mlmodelc | 0.7 MB | Grapheme-to-phoneme encoder |
| G2PDecoder.mlmodelc | 0.8 MB | Grapheme-to-phoneme decoder |
| voices/ | 0.5 MB | 54 preset voice embeddings (10 languages) |
| vocab_index.json | 4 KB | Phoneme vocabulary |
| g2p_vocab.json | 4 KB | G2P vocabulary |
| us_gold.json, us_silver.json | 6 MB | English pronunciation dictionaries |
| pipeline_config.json | 4 KB | Swift pipeline config |
Voices
54 preset voices across 10 languages (counting US and UK English separately): English (US/UK), Spanish, French, Hindi, Italian, Japanese, Korean, Portuguese, Chinese.
Usage
Add speech-swift to Package.swift:
.package(url: "https://github.com/soniqo/speech-swift", branch: "main")
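For context, a complete manifest might look like the sketch below. Only the dependency URL and the OS requirements come from this README. The package and target name `KokoroDemo` is hypothetical, and the product name `KokoroTTS` is an assumption based on the module imported in the usage example, so check the speech-swift manifest for the actual product names.

```swift
// swift-tools-version:6.0
import PackageDescription

let package = Package(
    name: "KokoroDemo",  // hypothetical package name
    platforms: [.iOS(.v18), .macOS(.v15)],  // matches the stated iOS 18+ / macOS 15+ requirement
    dependencies: [
        .package(url: "https://github.com/soniqo/speech-swift", branch: "main")
    ],
    targets: [
        .executableTarget(
            name: "KokoroDemo",
            dependencies: [
                // Assumption: the library product is named after the imported module.
                .product(name: "KokoroTTS", package: "speech-swift")
            ]
        )
    ]
)
```

Tracking `branch: "main"` follows the README. Pinning to a tagged release or a specific revision, if the repository provides one, would make builds reproducible.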
Then synthesize:
import KokoroTTS
let tts = try await KokoroTTSModel.fromPretrained(
    modelId: "aufklarer/Kokoro-82M-CoreML"
)
let audio = try await tts.synthesize(
    "Hello world, this is a Kokoro test.",
    voice: "af_heart"
)
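The README does not state the return type of `synthesize`. Assuming it yields mono `Float` samples in the range [-1, 1] at 24 kHz (an assumption, not confirmed here), the following minimal sketch encodes such samples as a 16-bit PCM WAV file in memory. `wavData` is a hypothetical helper, not part of speech-swift.

```swift
import Foundation

/// Hypothetical helper: encode mono Float samples in [-1, 1] as 16-bit PCM WAV bytes.
func wavData(from samples: [Float], sampleRate: Int = 24_000) -> Data {
    var data = Data()
    // Append an integer in little-endian byte order, as the RIFF format requires.
    func append<T: FixedWidthInteger>(_ value: T) {
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }
    let bytesPerSample = 2
    let payloadSize = UInt32(samples.count * bytesPerSample)

    data.append(contentsOf: Array("RIFF".utf8))
    append(UInt32(36) + payloadSize)             // file size minus the 8-byte RIFF header
    data.append(contentsOf: Array("WAVE".utf8))
    data.append(contentsOf: Array("fmt ".utf8))
    append(UInt32(16))                           // fmt chunk size for PCM
    append(UInt16(1))                            // format tag 1 = integer PCM
    append(UInt16(1))                            // channel count: mono
    append(UInt32(sampleRate))
    append(UInt32(sampleRate * bytesPerSample))  // byte rate
    append(UInt16(bytesPerSample))               // block align
    append(UInt16(16))                           // bits per sample
    data.append(contentsOf: Array("data".utf8))
    append(payloadSize)
    for sample in samples {
        // Clip to [-1, 1] before scaling so the conversion to Int16 cannot overflow.
        let clipped = max(-1, min(1, sample))
        append(Int16((clipped * 32767).rounded()))
    }
    return data
}
```

If `audio` is indeed a `[Float]`, something like `try wavData(from: audio).write(to: URL(fileURLWithPath: "out.wav"))` would save it to disk.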
CLI:
swift run audio kokoro "Hello world" --voice af_heart --output out.wav
Source
- Base model: hexgrad/Kokoro-82M (Apache-2.0)
- Dictionaries and G2P: Apache-2.0
License
- Model weights: Apache-2.0
- CoreML conversion: Apache-2.0
Links
- speech-swift: Apple SDK
- soniqo.audio: website
- MLX vs CoreML on Apple Silicon: a practical guide (related blog post)
- soniqo.audio/blog: blog