github sipeter/CloneTTS v0.5.3

one month ago

This release includes all updates and fixes accumulated during the v0.5.3 beta series (beta1 and beta2).

New Features

  • Built-in Reference Audio: Three preset TTS voices are imported automatically on first install, so you can try voice cloning right away without recording anything.
  • Short Text Pronunciation Fix: A period is automatically appended to any text fragment that does not end with sentence-final punctuation before it is sent to the model, reducing abrupt cutoffs and rushed pronunciation at the end of short phrases.
  • Multi-select ZIP Import: The file picker for voice import now supports multi-select mode, allowing batch import of multiple ZIP files at once.
  • Selective Overwrite Import: Overwrite import now displays a voice list for the user to choose from, instead of blindly overwriting everything.
  • Import File Validation: Creating a new voice now validates the file format (audio formats only) and caps reference audio at 30 seconds. ZIP imports are checked for a valid content structure, and invalid files produce a clear error message.
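
The short-text pronunciation fix above can be illustrated with a minimal sketch. This is not CloneTTS's actual code: the function name `ensure_terminal_punctuation`, the punctuation set, and the default period character are all assumptions made for illustration.

```python
# Hypothetical sketch; not taken from the CloneTTS source.
# Assumed set of sentence-final punctuation (ASCII and CJK full-width forms).
SENTENCE_FINAL = set(".!?…。！？")


def ensure_terminal_punctuation(fragment: str, period: str = ".") -> str:
    """Append a period when a fragment lacks sentence-final punctuation."""
    stripped = fragment.rstrip()
    if not stripped:
        # Nothing to speak, so there is nothing to terminate.
        return stripped
    if stripped[-1] in SENTENCE_FINAL:
        # Already ends a sentence; leave it alone.
        return stripped
    return stripped + period
```

A caller would run each fragment through this function just before synthesis, passing `period="。"` for Chinese text. The sketch ignores trailing quotes and brackets, which a real implementation would likely need to handle.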

UI & Interaction Improvements

  • Drag-to-Reorder Overhaul: Drag-and-drop sorting for voice lists and replacement rule lists has been fully rewritten, completely eliminating edge-case jitter.
  • Smoother Scrolling: Fixed an issue where the whole Voice Tab page scrolled together as one unit.
  • Benchmark Button Optimization: Engine initialization now runs on a background thread, so the first tap no longer freezes the UI.
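
The benchmark button change follows a common pattern: build the expensive object on a worker thread and hand it to a callback once it is ready. The sketch below shows that pattern in general terms; `SlowEngine`, `BenchmarkController`, and `on_tap` are hypothetical names, and nothing here reflects how CloneTTS actually structures its engine.

```python
import threading
import time


class SlowEngine:
    """Hypothetical stand-in for a TTS engine that is expensive to construct."""

    def __init__(self) -> None:
        time.sleep(0.05)  # simulate loading model weights
        self.ready = True


class BenchmarkController:
    """Creates the engine lazily on a worker thread instead of the UI thread."""

    def __init__(self, engine_factory) -> None:
        self._engine_factory = engine_factory
        self._engine = None
        self._lock = threading.Lock()

    def _get_engine(self):
        # The lock ensures that two quick taps cannot build two engines.
        with self._lock:
            if self._engine is None:
                self._engine = self._engine_factory()
            return self._engine

    def on_tap(self, on_ready) -> threading.Thread:
        """Return immediately; call on_ready(engine) from the worker thread."""
        worker = threading.Thread(
            target=lambda: on_ready(self._get_engine()), daemon=True
        )
        worker.start()
        return worker
```

In a real UI toolkit the callback would post its result back to the main thread before touching any widgets; that step is omitted here because it depends on the framework.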

Bug Fixes & Core Changes

  • System TTS Playback Artifact Fix: Removed redundant runtime RNNoise denoising, eliminating the audio glitch at the beginning of playback caused by GRU initialization artifacts.
  • Dual-Engine Consolidation: Voice preview and System TTS / Benchmark now share a single engine instance, saving ~150 MB of memory.
  • "Restore Original Audio" Button Fix: Added applied-state tracking so the restore button only appears after audio has actually been processed.
  • Overwrite Import Fix: Fixed an issue where multiple files imported in overwrite mode would overwrite one another.
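
The applied-state tracking behind the "Restore Original Audio" fix can be pictured as a single flag that is set when processing runs and cleared on restore. The following is a sketch under that assumption only; the class `ReferenceAudio` and its members are hypothetical and are not taken from CloneTTS.

```python
class ReferenceAudio:
    """Hypothetical model of a reference clip with an applied-state flag."""

    def __init__(self, samples) -> None:
        self._original = list(samples)  # untouched copy kept for restore
        self.samples = list(samples)
        self.applied = False  # True only after processing has run

    def apply(self, process) -> None:
        """Run a processing step (e.g. denoising) and record that it ran."""
        self.samples = process(self.samples)
        self.applied = True

    def restore(self) -> None:
        """Bring back the original samples and clear the flag."""
        self.samples = list(self._original)
        self.applied = False

    @property
    def show_restore_button(self) -> bool:
        # The button is driven by the flag, not by whether a backup exists.
        return self.applied
```

Driving visibility from an explicit flag, rather than from the mere presence of a backup copy, is what keeps the button hidden until processing has actually happened.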
