1 parent 52624eb, commit 7b1daee
Showing 10 changed files with 244 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
name: Documentation | ||
on: | ||
push: | ||
branches: | ||
- main | ||
paths: | ||
- 'docs/**' | ||
- 'mkdocs.yml' | ||
- '.github/workflows/docs.yml' | ||
- 'requirements_docs.txt' | ||
|
||
permissions: | ||
contents: write | ||
|
||
jobs: | ||
deploy: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v4 | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v4 | ||
with: | ||
python-version: '3.10' | ||
|
||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install -r requirements_docs.txt | ||
- name: Deploy documentation | ||
run: mkdocs gh-deploy --force |
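
The `paths` filter in this workflow limits deployment to pushes that touch the documentation sources. As a rough sketch of that behaviour, the Python below approximates the filter with the standard-library `fnmatch` module. GitHub's own glob matching follows different rules in places (for example, a single `*` does not cross `/` there), so this is an illustration under that assumption, not a reimplementation. The helper name `triggers_docs_deploy` is hypothetical.

```python
from fnmatch import fnmatchcase

# Path patterns copied from the workflow's `on.push.paths` list.
DOCS_PATHS = [
    "docs/**",
    "mkdocs.yml",
    ".github/workflows/docs.yml",
    "requirements_docs.txt",
]


def triggers_docs_deploy(changed_files):
    """Return True if any changed file matches one of the patterns.

    Approximation: fnmatch's `*` also matches `/`, which is enough to
    model these four patterns but is not GitHub's exact glob semantics.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in DOCS_PATHS
    )


if __name__ == "__main__":
    print(triggers_docs_deploy(["docs/models.md"]))
    print(triggers_docs_deploy(["simpletts/models/xtts.py"]))
```

A push that only changes library code would therefore leave the published site untouched, while any edit under `docs/` or to `mkdocs.yml` would redeploy it.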
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
# SimpleTTS | ||
|
||
A lightweight Python library for running TTS models with a unified API. | ||
|
||
!!! warning | ||
|
||
This project is under active development and APIs may change. Not recommended for production use yet. | ||
|
||
## Features | ||
|
||
- 🚀 Simple and intuitive API - get started in minutes | ||
- 🔄 No model lock-in - switch models with just a few lines of code | ||
- 🎯 Focus on ease of use - a single API for all models | ||
- 📦 Minimal dependencies - one package for all models | ||
- 🔌 Extensible architecture - easily add new models | ||
|
||
## Installation | ||
|
||
Install the latest release from PyPI: | ||
|
||
```bash | ||
pip install simpletts | ||
``` | ||
|
||
Or get the latest version from source: | ||
|
||
```bash | ||
pip install git+https://github.com/fakerybakery/simpletts | ||
``` | ||
|
||
## Quick Start | ||
|
||
```python | ||
from simpletts.models.xtts import XTTS | ||
import soundfile as sf | ||
|
||
tts = XTTS(device="auto") | ||
# Note: XTTS is licensed under the CPML license which restricts commercial use. | ||
|
||
array, sr = tts.synthesize("Hello, world!", ref="sample.wav") | ||
|
||
sf.write("output.wav", array, sr) | ||
``` | ||
|
||
## Support & Feedback | ||
|
||
If you encounter any issues or have questions, please open an [issue](https://github.com/fakerybakery/simpletts/issues). | ||
|
||
## License | ||
|
||
This project is licensed under the **BSD-3-Clause** license. See the [LICENSE](LICENSE) file for more details. | ||
|
||
While SimpleTTS itself is open source and can be used commercially, please note that some supported models have different licensing terms: | ||
|
||
- XTTS is licensed under CPML which restricts commercial use | ||
- Kokoro is licensed under Apache-2.0 which allows commercial use | ||
- Other models may have their own licensing requirements | ||
|
||
Note that SimpleTTS **does not** use the GPL-licensed `phonemizer` library. Instead, it uses the BSD-licensed `openphonemizer` alternative. While this may slightly reduce pronunciation accuracy, its license is compatible with the BSD-3-Clause license of SimpleTTS. | ||
|
||
For complete licensing information for all included models and dependencies, please see the `licenses` directory. |
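
The "no model lock-in" claim in the README rests on every model exposing the same call shape: `synthesize(text, ref=...)` returning an array and a sample rate. The sketch below illustrates that idea without installing SimpleTTS. `Synthesizer`, `SineBackend`, and `duration_seconds` are hypothetical names introduced here, and the stand-in backend emits a tone instead of speech.

```python
import math
from typing import List, Protocol, Tuple


class Synthesizer(Protocol):
    """Call shape used in the Quick Start: text + ref -> (samples, sample rate)."""

    def synthesize(self, text: str, ref: str) -> Tuple[List[float], int]:
        ...


class SineBackend:
    """Hypothetical stand-in for a real model; emits a tone instead of speech."""

    def __init__(self, sample_rate: int = 24000) -> None:
        self.sample_rate = sample_rate

    def synthesize(self, text: str, ref: str) -> Tuple[List[float], int]:
        # 1/20 s of a 220 Hz tone per character, so longer text gives longer audio.
        n = (self.sample_rate // 20) * len(text)
        samples = [
            math.sin(2 * math.pi * 220 * i / self.sample_rate) for i in range(n)
        ]
        return samples, self.sample_rate


def duration_seconds(tts: Synthesizer, text: str, ref: str) -> float:
    """Works with any backend that follows the same call shape."""
    samples, sr = tts.synthesize(text, ref=ref)
    return len(samples) / sr


if __name__ == "__main__":
    print(duration_seconds(SineBackend(), "Hello", ref="sample.wav"))
```

Because `duration_seconds` depends only on the call shape, swapping the stand-in for a real model object should require changing just the line that constructs it, assuming that model accepts the same arguments.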
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
# Models | ||
|
||
## Supported Models | ||
|
||
| Model | License | Description | | ||
|-------|---------|-------------| | ||
| XTTS | CPML | High-quality multilingual TTS with voice cloning capabilities | | ||
| Kokoro | Apache-2.0 | Fast and lightweight English TTS with preset voices (no voice cloning) | | ||
| F5-TTS | CC BY-NC | Superb voice cloning and naturalness, but slower and less stable | | ||
| Parler TTS | Apache-2.0 | Describe a voice with a text prompt | | ||
|
||
## XTTS | ||
|
||
```python | ||
from simpletts.models.xtts import XTTS | ||
import soundfile as sf | ||
|
||
# Initialize XTTS model | ||
tts = XTTS(device="auto") | ||
|
||
# Synthesize speech | ||
text = "Hello world! This is a test of the XTTS text-to-speech system." | ||
audio, sr = tts.synthesize(text, ref="sample.wav", language="en") | ||
|
||
# Save output audio | ||
sf.write("output.wav", audio, sr) | ||
``` | ||
|
||
## Kokoro | ||
|
||
!!! note | ||
|
||
Currently, only English is supported through SimpleTTS. The Kokoro model itself supports multiple languages. | ||
|
||
```python | ||
from simpletts.models.kokoro import Kokoro | ||
import soundfile as sf | ||
|
||
# Initialize Kokoro model | ||
tts = Kokoro(device="auto") | ||
|
||
# Synthesize speech | ||
text = "Hello world! This is a test of the Kokoro text-to-speech system." | ||
audio, sr = tts.synthesize(text, ref="af") | ||
|
||
# Save output audio | ||
sf.write("output.wav", audio, sr) | ||
``` | ||
|
||
## F5-TTS | ||
|
||
```python | ||
from simpletts.models.f5 import F5 | ||
import soundfile as sf | ||
|
||
# Initialize F5 model | ||
tts = F5(device="auto") | ||
|
||
# Synthesize speech | ||
text = "Hello world! This is a test of the F5 text-to-speech system." | ||
audio, sr = tts.synthesize(text, ref="sample.wav") | ||
|
||
# Save output audio | ||
sf.write("output.wav", audio, sr) | ||
``` | ||
|
||
## Parler TTS | ||
|
||
!!! note | ||
|
||
If you install Parler TTS, you may run into dependency conflicts or other issues. Parler TTS is not officially supported by the SimpleTTS project, so please do not report Parler-related issues to SimpleTTS. | ||
|
||
Parler TTS is not officially published on PyPI, and PyPI security requirements prevent us from adding it as a required dependency. We have published several unofficial packages for Parler TTS and its dependencies to PyPI, but these are not guaranteed to work. | ||
|
||
If you run into issues, please try running `pip uninstall parler-tts` and then `pip install git+https://github.com/huggingface/parler-tts`. | ||
|
||
```python | ||
from simpletts.models.parler import Parler | ||
import soundfile as sf | ||
|
||
# Initialize Parler model | ||
tts = Parler(device="auto") | ||
|
||
# Synthesize speech | ||
text = "Hello world! This is a test of the Parler text-to-speech system." | ||
audio, sr = tts.synthesize(text, ref="A female speaker delivers a slightly expressive and animated speech with a moderate speed and pitch. The recording is of very high quality, with the speaker's voice sounding clear and very close up.") | ||
|
||
# Save output audio | ||
sf.write("output.wav", audio, sr) | ||
``` |
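
The four examples in this file differ only in the import path, the class name, and what `ref` means: a reference audio file for XTTS and F5-TTS, a voice name for Kokoro, and a text description for Parler TTS. As a minimal sketch, those differences can be collected in one table and the model imported lazily. The `MODELS` table and the helpers `load_model` and `example_ref` are hypothetical and not part of SimpleTTS; the Parler description is shortened for illustration.

```python
import importlib

# (module path, class name, example `ref`) taken from the examples in this file.
MODELS = {
    "xtts": ("simpletts.models.xtts", "XTTS", "sample.wav"),
    "kokoro": ("simpletts.models.kokoro", "Kokoro", "af"),
    "f5": ("simpletts.models.f5", "F5", "sample.wav"),
    # Shortened, illustrative voice description.
    "parler": (
        "simpletts.models.parler",
        "Parler",
        "A female speaker with a clear, close-up voice.",
    ),
}


def load_model(name, device="auto"):
    """Import the model's module on demand and construct the model.

    Requires simpletts and the chosen model's dependencies to be installed.
    Raises KeyError for a name that is not in MODELS.
    """
    module_path, class_name, _ = MODELS[name]
    module = importlib.import_module(module_path)
    return getattr(module, class_name)(device=device)


def example_ref(name):
    """Return the example `ref` value for a model name."""
    return MODELS[name][2]
```

With SimpleTTS installed, `load_model("kokoro").synthesize(text, ref=example_ref("kokoro"))` would mirror the Kokoro example. The XTTS example additionally passes `language="en"`, which this sketch does not model.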
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Roadmap | ||
|
||
## Models | ||
|
||
- [x] XTTS - Production-ready multilingual TTS | ||
- [x] Kokoro - StyleTTS 2-based English TTS without voice cloning | ||
- [x] F5-TTS - Superb voice cloning and naturalness, but slower and less stable | ||
- [x] Parler TTS - Describe a voice with a text prompt | ||
- [ ] StyleTTS 2 - Fast and efficient zero-shot voice cloning | ||
- [ ] CosyVoice2 - Zero-shot voice cloning | ||
- [ ] MetaVoice - 1.1B parameter zero-shot voice cloning model | ||
- [ ] Fish Speech 1.5 - Zero-shot voice cloning | ||
- [ ] OpenVoice V2 - Open source zero-shot voice cloning by MyShell | ||
|
||
## Features | ||
|
||
- [x] Simple Python API for easy integration | ||
- [ ] Command-line interface for quick testing and batch processing | ||
- [ ] REST API and web interface for remote access | ||
- [ ] Model benchmarking tools | ||
- [ ] Batch processing support | ||
- [ ] Audio post-processing options | ||
- [ ] Allow easier extensibility with a plugin system |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
site_name: SimpleTTS | ||
theme: | ||
name: material | ||
palette: | ||
scheme: slate | ||
primary: black | ||
markdown_extensions: | ||
- admonition | ||
- pymdownx.details | ||
- pymdownx.superfences | ||
- def_list | ||
- pymdownx.tasklist: | ||
custom_checkbox: true | ||
extra: | ||
generator: false |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
mkdocs | ||
pymdown-extensions | ||
mkdocs-material |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Welcome to MkDocs | ||
|
||
For full documentation visit [mkdocs.org](https://www.mkdocs.org). | ||
|
||
## Commands | ||
|
||
* `mkdocs new [dir-name]` - Create a new project. | ||
* `mkdocs serve` - Start the live-reloading docs server. | ||
* `mkdocs build` - Build the documentation site. | ||
* `mkdocs -h` - Print help message and exit. | ||
|
||
## Project layout | ||
|
||
mkdocs.yml # The configuration file. | ||
docs/ | ||
index.md # The documentation homepage. | ||
... # Other markdown pages, images and other files. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
site_name: My Docs |