
๐Ÿธ v0.1.0

In a nutshell, there are a ton of updates in this release. I don't know if we can cover them all here, but let's try.

From this release on, 🐸 TTS stands on the following architecture:

  • Trainer API for training.
  • Synthesizer API for inference.
  • ModelManager API for managing the 🐸 TTS model zoo.
  • SpeakerManager API for managing speakers in a multi-speaker setting.
  • (TBI) Exporter API for exporting models to ONNX, TorchScript, etc.
  • (TBI) Data Processing API for making a dataset ready for training.
  • Model API for implementing models, compatible with all the other components above.

Updates

💾 Code updates

  • Brand new Trainer API

    We unified all the training code in a lightweight but feature-complete Trainer API. From now on, all 🐸 TTS models use this new API for training.

    It provides mixed-precision training (with NVIDIA's APEX or torch.amp) and multi-GPU training for all the models.
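
    Here is a minimal sketch of the new workflow, trimmed from the LJSpeech GlowTTS recipe shipped with this release (import paths and the init_training helper follow that recipe and may change in later versions):

      import os

      from TTS.trainer import Trainer, TrainingArgs, init_training
      from TTS.tts.configs import BaseDatasetConfig, GlowTTSConfig

      output_path = os.path.dirname(os.path.abspath(__file__))

      # Point the dataset config at a local copy of LJSpeech.
      dataset_config = BaseDatasetConfig(
          name="ljspeech", meta_file_train="metadata.csv", path="LJSpeech-1.1/"
      )
      config = GlowTTSConfig(
          batch_size=32,
          run_eval=True,
          epochs=1000,
          mixed_precision=True,  # APEX / torch.amp backed
          output_path=output_path,
          datasets=[dataset_config],
      )

      # init_training parses command line args and prepares the output folder.
      args, config, output_path, _, c_logger, tb_logger = init_training(TrainingArgs(), config)
      trainer = Trainer(args, config, output_path, c_logger, tb_logger)
      trainer.fit()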

  • Brand new Model API

    The abstract BaseModel class and its BaseTTS and BaseVocoder children are now the basis of all 🐸 TTS models. Any model that implements one of these classes works seamlessly with the Trainer and the Synthesizer.
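
    To give the idea, here is a hypothetical skeleton (the import path, hook names, and batch keys are assumptions based on how the Trainer drives models, not a verbatim copy of the base class):

      from TTS.tts.models.base_tts import BaseTTS  # assumed module path

      class MyModel(BaseTTS):
          """Toy model sketching only the hooks the Trainer relies on."""

          def forward(self, text, text_lengths, mel, mel_lengths):
              # Produce training-time outputs for a batch.
              raise NotImplementedError

          def inference(self, text):
              # Produce a spectrogram at synthesis time.
              raise NotImplementedError

          def train_step(self, batch, criterion):
              # Return model outputs and a dict of named losses.
              outputs = self.forward(
                  batch["text_input"], batch["text_lengths"],
                  batch["mel_input"], batch["mel_lengths"],
              )
              loss_dict = criterion(outputs, batch)
              return outputs, loss_dict

          def eval_step(self, batch, criterion):
              # Evaluation mirrors the training step here.
              return self.train_step(batch, criterion)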

  • Brand new 🐸 TTS recipes.

    We decided to merge the recipes into the main project. We now host recipes for the LJSpeech dataset covering all the implemented models, so you can pick the model you want, change the parameters, and easily train your own model.

    Thanks to the new Trainer API and 👩‍✈️ Coqpit integration, we could implement these recipes in pure Python.
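
    For example, training from a recipe is just running its script (the path below follows the recipes/ljspeech layout in the repository; adjust it to the model you pick):

      python recipes/ljspeech/glow_tts/train_glowtts.py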

  • Updated SpeakerManager API

    TTS.utils.SpeakerManager is now the core unit for managing speakers in a multi-speaker model and for interfacing a SpeakerEncoder model with the tts and vocoder models.
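
    A rough sketch of the intended use (the import path, constructor argument, and attribute below are illustrative assumptions, not verified against this exact version):

      from TTS.tts.utils.speakers import SpeakerManager  # assumed module path

      # Load the speaker metadata saved during multi-speaker training.
      manager = SpeakerManager(speakers_file="speakers.json")  # assumed kwarg
      print(manager.num_speakers)  # number of known speakers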

  • Updated model training mechanics.

    You can now use pure Python to define your model and run the training. This is useful for training models in a Jupyter Notebook or any other Python environment.

    We also keep the old mechanics via `TTS/bin/train_tts.py` or `TTS/bin/train_vocoder.py`. You just need to replace the previous training script name with one of these two, depending on your model.

    python TTS/bin/train_tacotron.py --config_path config.json

    becomes

    python TTS/bin/train_tts.py --config_path config.json
  • Use 👩‍✈️ Coqpit for managing model class arguments.

    Now all the model arguments are defined in a Coqpit class and imported by the model config.
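
    For instance, a model's arguments now read like a plain dataclass (the class below is made up for illustration; save_json and load_json are Coqpit's serialization helpers):

      from dataclasses import dataclass

      from coqpit import Coqpit

      @dataclass
      class MyModelArgs(Coqpit):
          """Hypothetical model arguments, serializable to and from JSON."""
          hidden_channels: int = 256
          num_flow_blocks: int = 12
          dropout_p: float = 0.05

      args = MyModelArgs()
      args.save_json("model_args.json")  # round-trips via load_json("model_args.json")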

  • gruut-based character-to-phoneme conversion (👑 @synesthesiam)

    It is a drop-in replacement for the previous solution and is compatible with the released models, so all these models are functional again without version nitpicking.

  • Set `test_sentences` in the config rather than providing a txt file.

  • Set the maximum number of decoder steps for Tacotron 1-2 models in the config.
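
    Both of the last two items live in the model config. A short illustration (the Tacotron2Config import path is assumed for this release; the values are made up):

      from TTS.tts.configs import Tacotron2Config

      config = Tacotron2Config(
          max_decoder_steps=1000,  # hard upper bound for the autoregressive decoder
          test_sentences=[
              "Be a voice, not an echo.",
              "It took me quite a long time to develop a voice.",
          ],
      )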

๐Ÿƒโ€โ™€๏ธ Operational Updates

๐Ÿ… Model implementations

🚀 Model releases

We solved the compatibility issues and re-released some of the models. You can see them in the released binaries section.

You don't need to change anything. If you use v0.1.0, these new models are used by default.
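
As a quick sanity check, you can list the model zoo entries and synthesize with one of them (flag names follow TTS/bin/synthesize.py in this release; the model name below is just one example entry):

    python TTS/bin/synthesize.py --list_models
    python TTS/bin/synthesize.py \
        --model_name "tts_models/en/ljspeech/tacotron2-DDC" \
        --text "Hello from Coqui TTS." \
        --out_path output.wav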
