🐸 v0.0.10
Bug Fixes
- Make `synthesizer.py` save the output audio at the vocoder's sampling rate. This is necessary when the sampling rates of the tts and vocoder models differ and interpolation is applied to the tts model output before running the vocoder. In practice, it fixes the Spanish and French voices generated by `tts` or `tts-server` on the terminal.
- Handle utf-8 on Windows. (by @adonispujols)
- Fix loading the last model with `--continue_training`. It was loading the best_model regardless.
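The sampling-rate fix can be illustrated with a minimal sketch. It is not the code in `synthesizer.py`: the function names, the toy frame values, and the 16000/22050 rates are assumptions chosen for illustration, and the vocoder step is replaced by silent placeholder audio. It shows the two parts of the idea: the tts output is stretched along the time axis by the ratio of the two rates, and the final file is written with the vocoder's rate rather than the tts model's.

```python
import io
import wave


def interpolate_frames(frames, scale):
    """Linearly resample a [T][n_mels] list of frames along time by `scale`."""
    old_len = len(frames)
    new_len = max(1, round(old_len * scale))
    if old_len == 1 or new_len == 1:
        return [list(frames[0]) for _ in range(new_len)]
    out = []
    for i in range(new_len):
        # Map the new frame index back onto the original time axis.
        pos = i * (old_len - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def write_wav(samples, sample_rate):
    """Encode float samples in [-1, 1] as 16-bit mono PCM and return the WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        pcm = b"".join(
            int(max(-1.0, min(1.0, s)) * 32767).to_bytes(2, "little", signed=True)
            for s in samples
        )
        wav.writeframes(pcm)
    return buf.getvalue()


# Assumed rates, for illustration only.
tts_sr, vocoder_sr = 16000, 22050

# Toy "spectrogram": 4 frames with 2 bins each.
mel = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
scaled = interpolate_frames(mel, vocoder_sr / tts_sr)

# Placeholder for the vocoder output; a real vocoder would consume `scaled`.
fake_audio = [0.0] * 100

# The header must carry the vocoder rate, not the tts rate.
wav_bytes = write_wav(fake_audio, vocoder_sr)
```

Writing the file with `tts_sr` instead would make playback slower and lower pitched than intended, since the samples were generated for `vocoder_sr`.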
💾 Code updates
- Breaking Change: Update the default set of characters in `symbols.py`. You might need to set your character set in `config.json` if you want to use this version with models trained with the previous version.
- Chinese backend for text processing. (#654 by @kirianguiller)
- Enable torch.hub integration for the released models.
- First GitHub release.
- Dependency version fixes. Using numpy > 1.17.5 breaks some tests.
- WaveRNN fix. (by @gerazov)
- Big refactoring of the training scripts to share the init part of the code. (by @gerazov)
- Enable ModelManager to download models from GitHub releases.
- Add a test for `compute_statistics.py`.
- Light-touch updates in the `tts` and `tts-server` entry points. (thanks @thorstenMueller)
- Define default vocoder models for each tts model in `.models.json`. The `tts` and `tts-server` entry points use the default vocoder if the user does not specify one.
- `find_unique_chars.py` to find all the unique characters in a dataset.
- A better way of handling best models during training. (thx @gerazov)
- Pass the used characters to the model's config.json at the beginning of training. This prevents later code updates from affecting the trained models.
- Migration to GitHub Actions for CI.
- Deprecate wheel-based use of tts-server for the sake of the new design.
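For models trained before the character-set change, the idea of collecting a dataset's characters and pinning them in the config can be sketched as follows. This is an illustrative sketch, not the contents of `find_unique_chars.py`: the function, the sample transcripts, the punctuation set, and the `characters` / `punctuations` key names are assumptions, so check the config schema of your TTS version before copying them.

```python
import json


def find_unique_chars(transcripts):
    """Return the sorted unique characters across an iterable of transcript strings."""
    chars = set()
    for text in transcripts:
        chars.update(text)
    return sorted(chars)


# Hypothetical transcripts standing in for a dataset's metadata file.
transcripts = ["Hola, mundo.", "¿Qué tal?"]
unique = find_unique_chars(transcripts)

# Assumed punctuation set; everything else is treated as a regular character.
punctuation = set("!¡'(),-.:;¿? ")

# Hypothetical config fragment that pins the character set used for training.
config_patch = {
    "characters": {
        "characters": "".join(c for c in unique if c not in punctuation),
        "punctuations": "".join(c for c in unique if c in punctuation),
    }
}
config_json = json.dumps(config_patch, ensure_ascii=False, indent=2)
```

Pinning the set in the config, rather than relying on library defaults, is what keeps a trained model's symbol-to-index mapping stable when the defaults change.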
🚶‍♀️ Operational Updates
- Move released models to GitHub Releases and deprecate GDrive as the first option.
Model implementations
- No updates
New Pre-Trained Model Releases
- English ek1 - Tacotron2 model and WaveGrad vocoder under `.models.json`. (huge THX!! to @nmstoker)
- Russian Ruslan - Tacotron2-DDC model.
- Dutch model. (huge THX!! to @r-dh)
- Chinese Tacotron2 model. (huge THX!! to @kirianguiller)
- English LJSpeech - SpeedySpeech with WaveNet decoder.
Released Models
💡 All the models below are available through the `tts` end point as explained here.
Language | Dataset | Model Name | Model Type | TTS version | Download |
---|---|---|---|---|---|
English | LJSpeech | SpeedySpeech | tts | v0.0.10 | 💾 |
English | EK1 | Tacotron2 | tts | v0.0.10 | 💾 |
Dutch | MAI | TacotronDDC | tts | v0.0.10 | 💾 |
Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | 💾 |
English | LJSpeech | TacotronDCA | tts | v0.0.9 | 💾 |
English | LJSpeech | Glow-TTS | tts | v0.0.9 | 💾 |
Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
French | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
English | EK1 | WaveGrad | vocoder | v0.0.10 | 💾 |
Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | 💾 |
English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | 💾 |
Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | 💾 |
Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | 💾 |
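
Since each tts model now has a default vocoder defined in `.models.json`, the pairing of tts and vocoder rows in the table can be resolved automatically. The sketch below shows one way such a lookup might work. The nested layout, the `default_vocoder` key, the model name strings, and the `resolve_vocoder` helper are all assumptions made for illustration, not the actual schema or ModelManager API.

```python
# Hypothetical registry mirroring what a `.models.json` file might contain.
models = {
    "tts_models": {
        "nl": {
            "mai": {
                "tacotron2-DDC": {
                    "default_vocoder": "vocoder_models/nl/mai/parallel-wavegan"
                }
            }
        }
    },
    "vocoder_models": {"nl": {"mai": {"parallel-wavegan": {}}}},
}


def resolve_vocoder(models, tts_name, vocoder_name=None):
    """Return the user's vocoder if given, else the tts model's default (or None)."""
    if vocoder_name is not None:
        return vocoder_name
    # Assumed naming scheme: <type>/<language>/<dataset>/<model>.
    model_type, lang, dataset, model = tts_name.split("/")
    return models[model_type][lang][dataset][model].get("default_vocoder")


default = resolve_vocoder(models, "tts_models/nl/mai/tacotron2-DDC")
explicit = resolve_vocoder(
    models, "tts_models/nl/mai/tacotron2-DDC", "vocoder_models/custom/vocoder"
)
```

An explicit choice always wins, and a model without a registered default yields `None`, leaving the caller to decide on a fallback.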