Kokoro-82M MLX 4-bit
MLX 4-bit quantization of hexgrad/Kokoro-82M, produced with mlx-audio on Apple Silicon.
Provenance
Converted from mlx-community/Kokoro-82M-bf16, which in turn is a safetensors export of the original kokoro-v1_0.pth. The original weights remain bundled alongside the quantized model.safetensors so callers can fall back to bf16 where audio quality demands it.
Quantization notes
Kokoro is a StyleTTS2-derived architecture with many small LSTMs, iSTFTNet convolution blocks, and normalization layers that are not eligible for MLX quantization at default thresholds. Reported "bits per weight" after conversion: ~27.6. Most parameters remained bf16; only a subset of the large linear projections was actually quantized to 4 bits.
Practical takeaway:
- Savings vs bf16: ~13% on disk (270 MB for the quantized weights file vs 312 MB for the bf16 weights).
- Audio quality: indistinguishable from bf16 in casual testing, since the quantized layers are not on the critical synthesis path.
- If you want maximal disk savings, use the ONNX INT8 or GGUF variants from other authors.
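The ~13% figure follows directly from the two file sizes quoted above. The sketch below reproduces it and then, under two assumptions that are not stated in this card (file size is dominated by weights, and a 4-bit weight costs about 4.5 bits once a 16-bit scale and bias per 64-weight group are counted), estimates what share of parameters would have to be quantized to produce that saving. Treat the second number as an illustration, not a measurement; it is independent of the bits-per-weight value reported by the converter.

```python
# File sizes quoted in the list above, in MB.
QUANTIZED_MB = 270
BF16_MB = 312

# Fraction of disk space saved by the quantized weights file.
savings = 1 - QUANTIZED_MB / BF16_MB

# Assumptions (not from the model card):
#   - unquantized weights cost 16 bits each (bf16)
#   - 4-bit weights cost ~4.5 bits each: 4 bits of payload plus a
#     16-bit scale and a 16-bit bias shared by each group of 64 weights
BF16_BITS = 16.0
Q4_BITS = 4.0 + (16 + 16) / 64

# Average bits per weight implied by the size ratio alone.
avg_bits = BF16_BITS * QUANTIZED_MB / BF16_MB

# Solve avg_bits = (1 - f) * BF16_BITS + f * Q4_BITS for the quantized share f.
quantized_fraction = (BF16_BITS - avg_bits) / (BF16_BITS - Q4_BITS)

print(f"disk savings: {savings:.1%}")
print(f"estimated share of parameters quantized: {quantized_fraction:.1%}")
```

Under these assumptions, a bit under a fifth of the parameters would be quantized, which is consistent with the note that only a subset of the large linear projections was converted.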
Quickstart
```python
from mlx_audio.tts import load

model = load("majentik/Kokoro-82M-MLX-4bit")
audio = model.generate(
    text="Hello, this is a test of Kokoro 82M at 4-bit MLX.",
    voice="af_heart",  # one of the bundled voices in voices/
)
# audio is a numpy array at 24 kHz
```
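To listen to the result, the samples need to be written to an audio file at the 24 kHz rate noted above. The following is a minimal sketch using only the Python standard library; `write_wav` is a hypothetical helper, and it assumes the samples are mono floats in the range [-1.0, 1.0], which this card does not confirm. A synthetic tone stands in for the model output so the snippet runs without the model.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 24_000  # output rate stated in the quickstart


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples (assumed in [-1.0, 1.0]) as 16-bit PCM WAV."""
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range, then scale to a signed 16-bit integer.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Stand-in for model output: one second of a quiet 440 Hz tone.
tone = [0.2 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]

out_path = os.path.join(tempfile.gettempdir(), "kokoro_test.wav")
write_wav(out_path, tone)
```

With real output, pass the generated samples in place of `tone`; if they arrive as a multi-dimensional array, flatten them to a single sequence first.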
Files
| File | Purpose |
|---|---|
| model.safetensors | Quantized weights (mlx format) |
| kokoro-v1_0.safetensors | Original bf16 weights (preserved) |
| config.json | Model config with model_type: kokoro |
| voices/*.pt | Voice embeddings (54 voices bundled) |
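After downloading the repository, it can be useful to confirm that the local copy matches the layout listed above. The sketch below is one way to do that with the standard library; `check_layout` is a hypothetical helper, and the local directory name in the usage lines is an assumption.

```python
from pathlib import Path

# Expected contents, taken from the Files table above.
EXPECTED_FILES = ("model.safetensors", "kokoro-v1_0.safetensors", "config.json")
EXPECTED_VOICES = 54


def check_layout(root):
    """Return (missing_files, voice_count) for a local copy of the repo."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # Count voice embeddings; a missing voices/ directory simply yields 0.
    voice_count = len(list((root / "voices").glob("*.pt")))
    return missing, voice_count


if __name__ == "__main__":
    # "Kokoro-82M-MLX-4bit" is an assumed local download directory.
    missing, voices = check_layout("Kokoro-82M-MLX-4bit")
    print("missing files:", missing or "none")
    print(f"voices found: {voices} (expected {EXPECTED_VOICES})")
```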
License
Apache 2.0, inherited from the upstream model. See the base model for training details and attribution.
See also
- Base: hexgrad/Kokoro-82M
- Other MLX variants: mlx-community/Kokoro-82M-bf16
- mlx-audio: github.com/Blaizzy/mlx-audio
- Garden hub: majentik/garden