🐸 v0.2.0
🐛 Bug Fixes
- Fix phoneme pre-compute issue.
- Fix multi-speaker setup in Tacotron models.
- Fix small issues in the Trainer regarding multi-optimizer training.
💾 Code updates
- W&B integration for model logging and experiment tracking (👑 @AyushExel). The code uses Tensorboard by default. To use W&B, set the `log_dashboard` option in the config and define `project_name` and `wandb_entity`.
- Use `fsspec` for model saving/loading (👑 @agrinh).
- Allow models to define their own symbol list with an in-class `make_symbols()`.
- Allow choosing between per-epoch and per-step LR scheduler updates with `scheduler_after_epoch`.
- Make converting spectrograms from amplitude to dB optional with the `do_amp_to_db_linear` and `do_amp_to_db_mel` options.
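Taken together, the new options above can all be set in a training config. A minimal sketch in plain-dict form, where the field names come from the notes above but the values and the exact config schema are illustrative, not verified:

```python
# Sketch of a training config using the new options from this release.
# Field names are taken from the release notes; the values (and the
# config class they would belong to) are hypothetical examples.
config = {
    # Logging: Tensorboard is the default dashboard. To log to W&B
    # instead, select it here and fill in your project and entity.
    "log_dashboard": "wandb",
    "project_name": "my-tts-runs",   # hypothetical W&B project name
    "wandb_entity": "my-team",       # hypothetical W&B entity

    # LR scheduler: update once per epoch (True) or once per step (False).
    "scheduler_after_epoch": True,

    # Audio: keep amplitude-to-dB conversion for the linear spectrogram.
    "do_amp_to_db_linear": True,
}
```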
📄 Docs updates
- Add GlowTTS and VITS docs.
🤖 Model implementations
- VITS implementation with pre-trained models (https://arxiv.org/abs/2106.06103)
🚀 Model releases
- `vocoder_models--ja--kokoro--hifigan_v1` (👑 @kaiidams)

  HiFiGAN model trained on the Kokoro dataset to complement the existing Japanese model.

  Try it out:

  ```bash
  tts --model_name tts_models/ja/kokoro/tacotron2-DDC --text "こんにちは、今日はいい天気ですか？"
  ```
- `tts_models--en--ljspeech--tacotronDDC_ph`

  TacotronDDC with phonemes trained on LJSpeech, fixing the pronunciation errors caused by raw-text input in the previously released TacotronDDC model.

  Try it out:

  ```bash
  tts --model_name tts_models/en/ljspeech/tacotronDDC_ph --text "hello, how are you today?"
  ```
- `tts_models--en--ljspeech--vits`

  VITS model trained on LJSpeech.

  Try it out:

  ```bash
  tts --model_name tts_models/en/ljspeech/vits --text "hello, how are you today?"
  ```
- `tts_models--en--vctk--vits`

  VITS model trained on VCTK with multi-speaker support.

  Try it out:

  ```bash
  tts-server --model_name tts_models/en/vctk/vits
  ```
- `vocoder_models--en--ljspeech--univnet`

  UnivNet model trained on LJSpeech to complement the TacotronDDC model above.

  Try it out:

  ```bash
  tts --model_name tts_models/en/ljspeech/tacotronDDC_ph --text "hello, how are you today?"
  ```