diodiogod/TTS-Audio-Suite v4.16.0 on GitHub

🎙️ CosyVoice3 TTS Engine - Zero-Shot Voice Conversion!

Major new engine addition! CosyVoice3 brings powerful TTS capabilities AND zero-shot voice conversion. Previously, ChatterBox was the only zero-shot voice changer option (RVC requires training). Now you have another high-quality option with CosyVoice3 VC!

✨ New Features

CosyVoice3 Engine

Zero-shot voice conversion (VC) - Convert any voice to match another voice without training, as an alternative to ChatterBox VC
Iterative refinement cache - Improve VC quality through multiple passes
Multilingual TTS - 4 core languages (Chinese, English, Japanese, Korean) plus 5 additional languages
Three TTS modes: zero-shot, instruction, and cross-lingual voice cloning
Model variants: standard and RL-enhanced (improved quality, set as default)
Paralinguistic tag support for natural speech effects (laughter, breath, cough, etc.)
Character voice switching with [CharacterName] syntax
Language switching with [lang:code] syntax
SRT subtitle timing support for synchronized audio generation
Per-segment parameter control ([seed:42], [speed:1.5])
Live generation progress with token-by-token updates and ETA
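
To make the bracket syntax above concrete, here is a minimal Python sketch of how text carrying [CharacterName], [lang:code], [seed:42] and [speed:1.5] tags could be split into per-segment settings. It is an illustration only, not the suite's actual parser: the function name `parse_tagged_text`, the `narrator` default, and the rules that a tag applies to the text following it and that a character switch resets language and parameters are all assumptions. Paralinguistic tags are not handled; this sketch would read any unrecognized tag as a character name.

```python
import re

# Hypothetical sketch of the bracket-tag syntax; not the suite's real parser.
TAG_RE = re.compile(r"\[([^\[\]]+)\]")
PARAM_TYPES = {"seed": int, "speed": float}  # per-segment parameters named in the notes


def parse_tagged_text(text, default_character="narrator"):
    """Split tagged text into segments with character, language and parameters."""
    segments = []
    state = {"character": default_character, "lang": None, "params": {}}
    pos = 0

    def flush(chunk):
        chunk = chunk.strip()
        if chunk:
            # Copy params so later tags do not mutate earlier segments.
            segments.append({**state, "params": dict(state["params"]), "text": chunk})

    for match in TAG_RE.finditer(text):
        flush(text[pos:match.start()])
        pos = match.end()
        tag = match.group(1).strip()
        key, sep, value = tag.partition(":")
        key = key.strip().lower()
        if sep and key == "lang":
            state["lang"] = value.strip()
        elif sep and key in PARAM_TYPES:
            state["params"][key] = PARAM_TYPES[key](value.strip())
        else:
            # Assumption: any other tag is a character switch that resets the rest.
            state["character"] = tag
            state["lang"] = None
            state["params"] = {}
    flush(text[pos:])
    return segments


if __name__ == "__main__":
    demo = "[Alice] Hello there. [lang:ja][seed:42] Konnichiwa. [Bob][speed:1.5] Hi!"
    for segment in parse_tagged_text(demo):
        print(segment)
```

Each returned segment is a plain dict, so a caller could hand the text, character, language, and parameters to whichever engine node performs the generation.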

🔧 Improvements

Fix ChatterBox and RVC model discovery to work with custom model paths (extra_model_paths.yaml)
Fix RVC audio chunking errors with short segments

📚 Documentation

New CosyVoice3 paralinguistic tags guide
Updated README with CosyVoice3 features and examples

🙏 Credits

Initial CosyVoice3 implementation by @tazztone
