github diodiogod/TTS-Audio-Suite v4.16.0
v4.16.0 - CosyVoice3 TTS Engine

5 months ago

🎙️ CosyVoice3 TTS Engine - Zero-Shot Voice Conversion!

Major new engine addition! CosyVoice3 brings TTS and zero-shot voice conversion (VC) to the suite. Previously, ChatterBox was the only zero-shot voice conversion option, since RVC requires training. CosyVoice3 VC is now a second high-quality choice.

✨ New Features

CosyVoice3 Engine

  • Zero-shot voice conversion (VC) - Convert any voice to match another voice without training! Another option alongside ChatterBox VC
  • Iterative refinement cache - Improve VC quality through multiple passes
  • Multilingual TTS - 4 core languages (Chinese, English, Japanese, Korean) plus 5 additional languages
  • Three TTS modes: zero-shot, instruction, and cross-lingual voice cloning
  • Model variants: standard and RL-enhanced (improved quality, set as default)
  • Paralinguistic tag support for natural speech effects (laughter, breath, cough, etc.)
  • Character voice switching with [CharacterName] syntax
  • Language switching with [lang:code] syntax
  • SRT subtitle timing support for synchronized audio generation
  • Per-segment parameter control ([seed:42], [speed:1.5])
  • Live generation progress with token-by-token updates and ETA
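To make the inline tag syntax above concrete, here is a minimal Python sketch of how a script using `[CharacterName]`, `[lang:code]`, `[seed:42]`, and `[speed:1.5]` could be split into segments. This is not the suite's actual parser. The `Segment` class, the `parse_script` function, the `narrator` default, and the rule that a new character tag resets language and parameters are all assumptions made for illustration. Paralinguistic tags are left out because their exact syntax is not given in these notes.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical grammar inferred from the release notes: anything in [brackets].
TAG_RE = re.compile(r"\[([^\[\]]+)\]")
# Parameter tags named in the notes, with the type each value is parsed as.
PARAM_TYPES = {"seed": int, "speed": float}


@dataclass
class Segment:
    text: str
    character: str = "narrator"  # assumed default speaker
    lang: Optional[str] = None
    params: Dict[str, float] = field(default_factory=dict)


def parse_script(script: str) -> List[Segment]:
    """Split a tagged script into segments carrying the active tag state."""
    segments: List[Segment] = []
    character, lang, params = "narrator", None, {}
    pos = 0

    def flush(end: int) -> None:
        # Emit the text between the previous tag and this position, if any.
        text = script[pos:end].strip()
        if text:
            segments.append(Segment(text, character, lang, dict(params)))

    for match in TAG_RE.finditer(script):
        flush(match.start())
        pos = match.end()
        key, sep, value = match.group(1).strip().partition(":")
        if not sep:
            # No colon: treat as a character switch (assumption: resets state).
            character, lang, params = key, None, {}
        elif key == "lang":
            lang = value.strip()
        elif key in PARAM_TYPES:
            # Raises ValueError on a malformed value such as [seed:abc].
            params[key] = PARAM_TYPES[key](value)
        # Unknown key:value tags are dropped in this sketch.

    flush(len(script))
    return segments
```

For example, `parse_script("[Alice] Hello there. [lang:ja][speed:1.5] Konnichiwa. [Bob][seed:42] Hi.")` yields three segments: two for Alice (the second tagged with language `ja` and speed 1.5) and one for Bob with seed 42. Whether the real engine keeps parameters active across segments or applies them to a single segment only is not stated here, so check the README before relying on either behavior.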

🔧 Improvements

  • Fix ChatterBox and RVC model discovery to work with custom model paths (extra_model_paths.yaml)
  • Fix RVC audio chunking errors with short segments

📚 Documentation

  • New CosyVoice3 paralinguistic tags guide
  • Updated README with CosyVoice3 features and examples

🙏 Credits

Initial CosyVoice3 implementation by @tazztone
