NVIDIA/kokoro
Description: https://hf.co/hexgrad/Kokoro-82M
Language: JavaScript
License: Apache-2.0
Stars: 6
Forks: 0
Open issues: 0
Created: 2025-12-19T13:24:40Z
Pushed: 2025-12-19T13:27:54Z
Default branch: main
Fork: no
Archived: no
README:
kokoro
An inference library for Kokoro-82M. You can `pip install kokoro`.
> Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects.
Usage
You can run this basic cell on Google Colab. Listen to samples.
!pip install -q "kokoro>=0.9.4" soundfile
!apt-get -qq -y install espeak-ng > /dev/null 2>&1
from kokoro import KPipeline
from IPython.display import display, Audio
import soundfile as sf
import torch
pipeline = KPipeline(lang_code='a')
text = '''
[Kokoro](/kˈOkəɹO/) is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, [Kokoro](/kˈOkəɹO/) can be deployed anywhere from production environments to personal projects.
'''
generator = pipeline(text, voice='af_heart')
for i, (gs, ps, audio) in enumerate(generator):
    print(i, gs, ps)
    display(Audio(data=audio, rate=24000, autoplay=i==0))
    sf.write(f'{i}.wav', audio, 24000)

Under the hood, kokoro uses `misaki`, a G2P library at https://github.com/hexgrad/misaki
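The loop above writes one file per generated segment. If a single file is more convenient, the segments can be joined before saving. The following is a minimal sketch using only the Python standard library, not part of kokoro: the helper name `write_combined_wav` is hypothetical, and it assumes each `audio` value is a one-dimensional sequence of floats in the range [-1, 1] sampled at 24000 Hz, as in the example above.

```python
import struct
import wave

SAMPLE_RATE = 24000  # sample rate used in the example above


def write_combined_wav(path, segments, sample_rate=SAMPLE_RATE):
    """Concatenate float segments in [-1, 1] and save them as 16-bit mono PCM.

    Hypothetical helper for illustration. Returns the duration in seconds.
    """
    # Flatten all segments into one list of Python floats.
    samples = [float(x) for segment in segments for x in segment]
    # Clip to [-1, 1], then scale to the signed 16-bit integer range.
    pcm = [int(max(-1.0, min(1.0, x)) * 32767) for x in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)           # mono
        wav_file.setsampwidth(2)           # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return len(pcm) / sample_rate
```

With a pipeline in hand, a call might look like `write_combined_wav('combined.wav', [audio for _, _, audio in pipeline(text, voice='af_heart')])`. Converting each segment with `.tolist()` first is likely to be faster if the segments are tensors or arrays.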
Advanced Usage
You can run this advanced cell on Google Colab.
# 1️⃣ Install kokoro
!pip install -q "kokoro>=0.9.4" soundfile
# 2️⃣ Install espeak, used for English OOD fallback and some non-English languages
!apt-get -qq -y install espeak-ng > /dev/null 2>&1
# 3️⃣ Initialize a pipeline
from kokoro import KPipeline
from IPython.display import display, Audio
import soundfile as sf
import torch
# 🇺🇸 'a' => American English, 🇬🇧 'b' => British English
# 🇪🇸 'e' => Spanish es
# 🇫🇷 'f' => French fr-fr
# 🇮🇳 'h' => Hindi hi
# 🇮🇹 'i' => Italian it
# 🇯🇵 'j' => Japanese: pip install misaki[ja]
# 🇧🇷 'p' => Brazilian Portuguese pt-br
# 🇨🇳 'z' => Mandarin Chinese: pip install misaki[zh]
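pipeline = KPipeline(lang_code='a')
# `text` is the string defined in the basic example above
generator = pipeline(text, voice='af_heart')
for i, (gs, ps, audio) in enumerate(generator):
    print(i)  # i => index
    print(gs) # gs => graphemes/text
    print(ps) # ps => phonemes
    display(Audio(data=audio, rate=24000, autoplay=i==0))
    sf.write(f'{i}.wav', audio, 24000) # save each audio file

Windows Installation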
To install espeak-ng on Windows:
1. Go to espeak-ng releases
2. Click on Latest release
3. Download the appropriate *.msi file (e.g. espeak-ng-20191129-b702b03-x64.msi)
4. Run the downloaded installer

For advanced configuration and usage on Windows, see the official espeak-ng Windows guide.
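After installing, it can help to confirm that the executable is reachable. The sketch below is an illustration, not part of kokoro. It assumes the binary is named `espeak-ng` and is on `PATH`, and the function names are hypothetical. A `None` result only means the executable was not found on `PATH`; it does not prove that kokoro cannot locate espeak-ng by other means.

```python
import shutil
import subprocess


def find_espeak(executable="espeak-ng"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(executable)


def espeak_version(executable="espeak-ng"):
    """Return the first line printed by `<executable> --version`, or None."""
    path = find_espeak(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None
```

Calling `print(espeak_version() or "espeak-ng not found on PATH")` gives a quick yes/no answer from a Python prompt.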
MacOS Apple Silicon GPU Acceleration
On Mac M1/M2/M3/M4 devices, set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to enable GPU acceleration. With it set, PyTorch runs operations that the MPS backend does not support on the CPU instead of raising an error.
PYTORCH_ENABLE_MPS_FALLBACK=1 python run-your-kokoro-script.py
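In a notebook, or when the launch command cannot be changed, the same variable can be set from Python. This is a minimal sketch with a hypothetical helper name. It follows the usual advice of setting the variable before `torch` is first imported, which is an assumption about when PyTorch reads it rather than something the README states.

```python
import os
import sys


def enable_mps_fallback():
    """Set PYTORCH_ENABLE_MPS_FALLBACK=1 for the current process.

    Returns True if torch had not been imported yet, and False otherwise,
    in which case the setting may come too late to take effect.
    """
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
    return "torch" not in sys.modules


# Call this first, then import torch and kokoro afterwards.
set_in_time = enable_mps_fallback()
```

A `False` return value is a hint to restart the interpreter or kernel and run this step before any other imports.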
Conda Environment
Use the following conda environment.yml if you're facing any dependency issues.
name: kokoro
channels:
  - defaults
dependencies:
  - python==3.9
  - libstdcxx~=12.4.0 # Needed to load espeak correctly. Try removing this if you're facing issues with Espeak fallback.
  - pip:
    - kokoro>=0.3.1
    - soundfile
    - misaki[en]
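Once the environment is created, a quick check of what actually got installed can narrow down dependency problems. The sketch below uses only `importlib.metadata` from the standard library. The function name is hypothetical, and the distribution names are taken from the pip section of the environment file above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names from the pip section of the environment file.
PACKAGES = ("kokoro", "soundfile", "misaki")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found
```

Printing `installed_versions()` inside the activated environment shows at a glance which packages are missing (`None`) and which versions were resolved.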
Acknowledgements
- 🛠️ @yl4579 for architecting StyleTTS 2.
- 🏆 @Pendrokar for adding Kokoro as a contender in the TTS Spaces Arena.
- 📊 Thank you to everyone who contributed synthetic training data.
- ❤️ Special thanks to all compute sponsors.
- 👾 Discord server: https://discord.gg/QuGxSWBfQy
- 🪽 Kokoro is a Japanese word that translates to "heart" or "spirit". Kokoro is also a character in the Terminator franchise along with Misaki.
Notability: 3.0/10 (low-stars NVIDIA repo release)