tgeczy/TGSpeechBox v-300b1 on GitHub

TGSpeechBox v3.0 Beta 1

This is the first public beta of the v3.0 release. It introduces coda noise continuity for more natural consonant cluster transitions, a rewritten NVDA driver, Android APK distribution, and significant tuning of the English and German language packs.

New Features

Coda Noise Taper

Word-final fricative→stop clusters like /st/ in "list", /sk/ in "risk", and /sp/ in "lisp" now use a two-phase source crossfade during the closure gap instead of true silence. The fricative's noise energy tapers through the parallel path while aspiration rises through the cascade path, priming the resonators for the stop's burst. This eliminates the hard segmental boundary that made coda clusters sound disconnected, and gives stop releases proper formant structure from the first millisecond.

Six per-language-pack settings control the taper behavior:

codaNoiseTaperEnabled — master enable (default: true)
codaNoiseTaperPreGain — keeps both signal paths active during closure (default: 0.40)
codaNoiseTaperEarlyFricScale — frication level in the sibilant tail phase, as a fraction of the preceding fricative (default: 0.45)
codaNoiseTaperEarlyAspAmp — aspiration feed into cascade during early phase (default: 0.04)
codaNoiseTaperLateFricScale — frication level in the aspirated transition phase (default: 0.08)
codaNoiseTaperLateAspAmp — aspiration level in late phase, cascade-dominant by this point (default: 0.22)
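
As a rough illustration of how the two phases relate, here is a minimal Python sketch built on the defaults listed above. It is not the engine's implementation: the class and function names, the hard switch between phases, and the split point at the middle of the closure are all assumptions made for the example. codaNoiseTaperPreGain is carried as a field but not applied, because these notes do not say where in the signal chain it acts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodaNoiseTaper:
    # Defaults mirror the six pack settings listed in the release notes.
    enabled: bool = True
    pre_gain: float = 0.40  # not applied in this sketch (usage not documented here)
    early_fric_scale: float = 0.45
    early_asp_amp: float = 0.04
    late_fric_scale: float = 0.08
    late_asp_amp: float = 0.22


def closure_sources(position, fric_amp, taper=CodaNoiseTaper(), split=0.5):
    """Return (frication, aspiration) source levels inside the closure gap.

    position: 0.0 at the start of the closure, 1.0 at the stop release.
    fric_amp: frication amplitude of the preceding fricative.
    split:    hypothetical boundary between the early and late phase.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be within [0, 1]")
    if not taper.enabled:
        # Taper disabled: the closure gap is true silence.
        return 0.0, 0.0
    if position < split:
        # Early phase: sibilant tail, frication-dominant.
        return fric_amp * taper.early_fric_scale, taper.early_asp_amp
    # Late phase: aspirated transition, cascade-dominant.
    return fric_amp * taper.late_fric_scale, taper.late_asp_amp
```

Under these assumptions, a closure that follows a full-scale fricative starts with most of its energy in frication and ends with most of it in aspiration, which matches the direction of the crossfade described above. A real engine would presumably ramp between the levels rather than switch abruptly.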

Place-Typed Allophone Rules

Allophone rules in language packs can now specify consonant place of articulation as a matching condition. Four place types are supported:

labial — /p b m f v w ʍ ɸ β/
alveolar — /t d n s z l r ɹ ɾ θ ð ɬ ɮ ɻ ɖ ʈ ɳ ɽ/
palatal — /ʃ ʒ t͡ʃ d͡ʒ j ɲ ç ʝ c ɟ ʎ/
velar — /k g ŋ x ɣ ɰ/

The phoneme editor now shows these place assignments and validates rules against them: if you assign a phoneme to the wrong place type, the editor flags it.
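
The kind of check involved can be sketched in a few lines of Python. The four phoneme sets are copied from the list above. Everything else (the names PLACE_MEMBERS, PLACE_OF, and check_place, and the wording of the warnings) is hypothetical and is not taken from the editor's code or the language pack format.

```python
# Place-of-articulation sets as listed in the release notes.
PLACE_MEMBERS = {
    "labial": "p b m f v w ʍ ɸ β",
    "alveolar": "t d n s z l r ɹ ɾ θ ð ɬ ɮ ɻ ɖ ʈ ɳ ɽ",
    "palatal": "ʃ ʒ t͡ʃ d͡ʒ j ɲ ç ʝ c ɟ ʎ",
    "velar": "k g ŋ x ɣ ɰ",
}

# Reverse lookup: phoneme -> place type.
PLACE_OF = {
    phoneme: place
    for place, members in PLACE_MEMBERS.items()
    for phoneme in members.split()
}


def check_place(phoneme, claimed_place):
    """Return None if the assignment is consistent, else a warning string."""
    if claimed_place not in PLACE_MEMBERS:
        return f"unknown place type: {claimed_place!r}"
    actual = PLACE_OF.get(phoneme)
    if actual is None:
        # Phoneme is in none of the listed sets; nothing to validate against.
        return None
    if actual != claimed_place:
        return f"/{phoneme}/ is listed as {actual}, not {claimed_place}"
    return None
```

For example, check_place("s", "velar") returns a warning because /s/ appears in the alveolar set, while check_place("k", "velar") returns None.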

NVDA Driver Rewrite

The NVDA synthesizer driver has been rewritten. It now calls into eSpeak's library directly for phoneme conversion rather than relying on NVDA's internal eSpeak bindings. This gives TGSpeechBox full control over the phoneme pipeline and removes a longstanding coupling to NVDA internals.

The driver is also more modular: the monolithic ~100 KB __init__.py has been split into smaller, focused modules.

Breaking change: NVDA versions 2023.1 through 2023.4 are no longer supported. NVDA 2024.1 and later remain supported.

Android APK

Android APK builds are now bundled with GitHub releases. This is the first step in the v3.0 mobile expansion roadmap.

iOS and macOS Builds

iOS and macOS builds are available for public testing via TestFlight:

https://testflight.apple.com/join/jvvGY6Fz

Language Tuning

English — US (en-us)

Coarticulation pass enabled, resolving clipping on /aɪ/ diphthong onsets in words like "five", "dialog", and "filesystem"
Coda noise taper tuning for word-final stop clusters
Continued prominence pass refinements

English — GB (en-gb)

General tuning pass across vowel inventory and consonant timing

English — AU (en-au)

New dialect pack added. This is an early version and will need significant further tuning in upcoming betas
KIT vowel tuned for the raised/centralized Australian English realization

German (de)

Language tuning improvements
