TTS Audio Suite v4.15.0

🎉 Major New Features

A powerful new AI-powered text-to-speech engine with zero-shot voice cloning.

Transform any TTS engine's output with AI-powered audio editing (post-processing):

14 emotions: happy, sad, angry, surprised, fearful, disgusted, contempt, neutral, and more
32 speaking styles: whisper, serious, child, elderly, neutral, and more
Speed control: make speech faster or slower
10 paralinguistic effects: laughter, breathing, sigh, gasp, crying, sniff, cough, yawn, scream, moan
Audio cleanup: denoise and voice activity detection
Universal compatibility: Works with audio from ANY TTS engine, including ChatterBox, F5-TTS, Higgs Audio, and VibeVoice

Add audio effects directly in your text across all TTS engines:

Easy syntax: "Hello <Laughter> this is amazing!"
Works everywhere: Tags are applied through Step Audio EditX post-processing, so they work with every TTS engine
Multiple tag types: <emotion>, <style>, <speed>, and paralinguistic effects
Control intensity: <Laughter:2> for a stronger effect, <Laughter:3> for the maximum
Voice restoration: use the <restore> tag to return to the original voice after edits
📖 Read the complete Inline Edit Tags guide
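
To make the tag syntax above concrete, here is a minimal Python sketch of how text containing inline tags could be split into the spoken text and a list of requested edits. It is an illustration only, not the suite's actual parser: the function name `split_inline_tags`, the `InlineTag` type, the regular expression, and the assumption that a tag without `:N` has intensity 1 are all hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple

# Hypothetical pattern for <Name> or <Name:N>; not taken from the suite's source.
TAG_RE = re.compile(r"<([A-Za-z_]+)(?::(\d+))?>")


class InlineTag(NamedTuple):
    name: str       # tag name as written, e.g. "Laughter" or "restore"
    intensity: int  # N from <Name:N>; assumed to be 1 when omitted


def split_inline_tags(text: str) -> Tuple[str, List[InlineTag]]:
    """Return the text with tags removed, plus the tags in order of appearance."""
    tags = [
        InlineTag(match.group(1), int(match.group(2) or 1))
        for match in TAG_RE.finditer(text)
    ]
    # Replace each tag with a space, then collapse runs of whitespace.
    clean = " ".join(TAG_RE.sub(" ", text).split())
    return clean, tags


clean, tags = split_inline_tags("Hello <Laughter:2> this is amazing! <restore>")
print(clean)  # spoken text without tags
print(tags)   # edits a post-processing step could apply
```

In this sketch the cleaned text would go to whichever TTS engine is in use, and the tag list would drive the post-processing pass afterwards. The complete guide remains the authority on which tag names and intensity values are actually accepted.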