🐸 v0.0.13
Bug Fixes
Code updates
- `SpeakerManager` class for handling multi-speaker model management and interfacing with the `speaker.json` file.
- Enabling multi-speaker models with the `tts` and `tts-server` endpoints. (@kirianguiller)
- Allow choosing a different `noise scale` for GlowTTS at inference.
- Glow-TTS updates to import SC-Glow models.
- Fixing Windows support. (@WeberJulian)
Operational Updates
- Refactored 🐸 TTS installation to allow selecting different scopes (`all`, `tf`, `notebooks`) depending on specific needs.
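As a minimal sketch of how the installation scopes above could be selected from a script, the Python helper below builds a `pip install` command. It assumes the scopes are exposed as standard pip extras and that the package is installed under the name `TTS`. The helper `pip_install_command` is hypothetical and not part of 🐸 TTS.

```python
import sys

# Scope names are taken from the release note above.
KNOWN_SCOPES = ("all", "tf", "notebooks")


def pip_install_command(scopes=(), package="TTS"):
    """Return an argv list that installs `package` with the given scopes.

    Assumption: each scope maps to a pip extra, e.g. TTS[tf,notebooks].
    """
    unknown = [s for s in scopes if s not in KNOWN_SCOPES]
    if unknown:
        raise ValueError(f"unknown scope(s): {unknown}")
    # No scopes means a plain install without extras.
    requirement = f"{package}[{','.join(scopes)}]" if scopes else package
    return [sys.executable, "-m", "pip", "install", requirement]


# Show the requirement string that would be passed to pip.
print(pip_install_command(["tf", "notebooks"])[-1])
```

The returned list can be passed to `subprocess.run(..., check=True)` to perform the installation. Building the argv list separately keeps the sketch free of side effects.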
Model implementations
New Pre-Trained Model Releases
- SC-GlowTTS multi-speaker English model from our work https://arxiv.org/abs/2104.05557 (@Edresson)
- HiFiGAN vocoder finetuned for the above model.
- Tacotron DDC Non-Binary English model using Accenture's Sam dataset.
- HiFiGAN vocoder trained for the models above.
Released Models
💡 All the models below are available through the `tts` or `tts-server` endpoints on the CLI, as explained here.
Models marked with ✨ below are new in this release.
- The SC-GlowTTS model is from our latest paper, a collaboration with @Edresson and @mueller91.
- The new non-binary TTS model is trained on the SAM dataset from Accenture Labs. Check out their blog post.
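As a rough sketch of driving the `tts` endpoint from a script, the Python helper below assembles a command line from the CLI's `--text`, `--model_name`, `--out_path`, and `--list_models` options. The helper name `tts_command`, the default output path, and the example model identifier are illustrative assumptions. Run `tts --list_models` to see the identifiers actually available in your installation.

```python
import shlex

# Command that asks the CLI to print the released model identifiers.
LIST_MODELS_COMMAND = ["tts", "--list_models"]


def tts_command(text, model_name=None, out_path="output.wav"):
    """Build an argv list for synthesizing `text` with the `tts` CLI.

    `model_name` is expected in the CLI's
    "<type>/<language>/<dataset>/<model_name>" form; when omitted,
    the CLI falls back to its default model.
    """
    argv = ["tts", "--text", text, "--out_path", out_path]
    if model_name is not None:
        argv += ["--model_name", model_name]
    return argv


# The identifier below is an assumed example, not confirmed by this release note.
command = tts_command(
    "Hello from the release notes.",
    model_name="tts_models/en/ljspeech/tacotron2-DDC",
)

# Print shell-quoted command lines that can be pasted into a terminal.
print(shlex.join(LIST_MODELS_COMMAND))
print(shlex.join(command))
```

Passing `command` to `subprocess.run(command, check=True)` would run the synthesis, provided 🐸 TTS is installed and the `tts` executable is on the `PATH`.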
Language | Dataset | Model Name | Model Type | TTS version | Download |
---|---|---|---|---|---|
✨ English (non-binary) | sam (accenture) | Tacotron2-DDC | tts | v0.0.13 | 💾 |
✨ English (multi-speaker) | VCTK | SC-GlowTTS | tts | v0.0.13 | 💾 |
English | LJSpeech | Tacotron-DDC | tts | v0.0.12 | 💾 |
German | Thorsten-DE | Tacotron-DCA | tts | v0.0.11 | 💾 |
German | Thorsten-DE | WaveGrad | vocoder | v0.0.11 | 💾 |
English | LJSpeech | SpeedySpeech | tts | v0.0.10 | 💾 |
English | EK1 | Tacotron2 | tts | v0.0.10 | 💾 |
Dutch | MAI | TacotronDDC | tts | v0.0.10 | 💾 |
Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | 💾 |
English | LJSpeech | TacotronDCA | tts | v0.0.9 | 💾 |
English | LJSpeech | Glow-TTS | tts | v0.0.9 | 💾 |
Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
French | M-AILabs | TacotronDDC | tts | v0.0.9 | 💾 |
✨ English | sam (accenture) | HiFiGAN | vocoder | v0.0.13 | 💾 |
✨ English | VCTK | HiFiGAN | vocoder | v0.0.13 | 💾 |
English | LJSpeech | HiFiGAN | vocoder | v0.0.12 | 💾 |
English | EK1 | WaveGrad | vocoder | v0.0.10 | 💾 |
Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | 💾 |
English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | 💾 |
Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | 💾 |
Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | 💾 |
Update Jun 7 2021: The Ruslan (Russian) model has been removed due to a license conflict.