tgeczy/TGSpeechBox v-300b5 on GitHub

TGSpeechBox v3.0-beta5

Changes since v3.0-beta4. 66 commits!

⚠️ Last build before CSUN 2026! The team will be at the conference next week — no new builds or updates for about a week. We'll be back at it after CSUN. If you're there, come say hi!

New Features

Windows 7 compatibility (issue #45): All Windows binaries (SAPI, NVDA DLLs, phoneme editor, settings app) now run on Windows 7. The build switched to the static CRT and integrates YY-Thunks v1.1.9, which provides fallback implementations for the Win8+ APIs the CRT uses internally. Binaries are fully self-contained, so no MSVC redistributable is needed. Thank you @aksel for reporting this.
Head Size slider: New vocal tract length control with graduated formant scaling (F1×s^0.2 through F6×s^1.0) for realistic head-size variation. Available on NVDA, SAPI, iOS, and Android. David voice defaults to headSize 100 (large pharynx).
Voice profiles on mobile: Beth (female) and Bobby (child) voices now available on iOS and Android, with dynamic discovery from phonemes.yaml.
Pack settings editor on mobile: New Editor tab on iOS and Android lets you browse and override any language pack setting with live reload. Changes are stored as overrides and auto-removed when they match the pack default.
Year splitting: 4-digit numbers are split into two 2-digit pairs ("1995" → "nineteen ninety-five"), with "oh" for leading-zero pairs ("1906" → "nineteen oh six"). Edge cases: X000 numbers are passed to eSpeak whole ("4000" → "four thousand"), and 200X numbers also stay whole. An NVDA checkbox toggles the feature. Enabled for all English dialects.
English date ordinals: "June 6" → "June 6th", "March 1" → "March 1st" — ordinal suffixes added before eSpeak phonemization. Wired through SAPI, NVDA, and mobile.
Sonorant-context vowel protection: Unstressed vowels flanked by nasals, liquids, or semivowels get an extra duration floor (8 ms) and an amplitude boost (1.15×). Fixes "animals" losing its middle syllable at rate 15+. The rule is cross-linguistic, so all 26 languages benefit.
Cross-word allophone conditions: New rule conditions that look across word boundaries. Used to fix Spanish word-final /ɾ/ trilling before vowels (issue #42).
en-gb stress dictionary: 2,005 entries derived from Reece H. Dunn's RP dictionary (BSD 2-clause).
Vocal tract shape parameters: Three new VoicingTone fields — nasalBwScale, f4FreqScale, nasalGainScale — for finer voice character control.
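
As a rough illustration of the Head Size scaling listed above, here is a minimal sketch. Only the endpoints (F1×s^0.2 and F6×s^1.0) come from these notes; the evenly spaced exponents for F2 through F5, the function name scale_formants, and the use of s as a plain scale factor are assumptions, and the mapping from the slider value to s is not shown.

```python
# Assumption: exponents rise in equal steps from 0.2 (F1) to 1.0 (F6).
# Only the two endpoints are stated in the release notes.
HEAD_SIZE_EXPONENTS = (0.2, 0.36, 0.52, 0.68, 0.84, 1.0)


def scale_formants(formants_hz, s):
    """Scale six formant frequencies (F1..F6, in Hz) by s raised to a
    per-formant exponent, so higher formants move more than lower ones."""
    if len(formants_hz) != len(HEAD_SIZE_EXPONENTS):
        raise ValueError("expected exactly six formant frequencies")
    if s <= 0:
        raise ValueError("scale factor must be positive")
    return [f * s ** e for f, e in zip(formants_hz, HEAD_SIZE_EXPONENTS)]
```

With s = 1.0 the formants are unchanged; with any other s, F6 is scaled by the full factor while F1 moves only slightly.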
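
The year-splitting rules listed above can be sketched as a text preprocessing step that runs before phonemization. This is not the project's implementation: the function names, the regex, the "oh" text substitution, and the choice to leave XX00 numbers and leading-zero first pairs whole are all assumptions made for illustration.

```python
import re

# Standalone 4-digit runs only (not part of a longer digit sequence).
_FOUR_DIGITS = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def split_year(token: str) -> str:
    """Return replacement text for one 4-digit token."""
    first, second = token[:2], token[2:]
    # X000 ("4000") stays whole so it is read as "four thousand".
    if token[1:] == "000":
        return token
    # 200X ("2005") stays whole.
    if token[:3] == "200":
        return token
    # Assumption: not covered by the notes, so left whole here.
    if first[0] == "0" or second == "00":
        return token
    # Leading-zero second pair: "1906" becomes "19 oh 6".
    if second[0] == "0":
        return f"{first} oh {second[1]}"
    # General case: "1995" becomes "19 95".
    return f"{first} {second}"


def split_years(text: str) -> str:
    """Apply split_year to every standalone 4-digit number in text."""
    return _FOUR_DIGITS.sub(lambda m: split_year(m.group(1)), text)
```

The output is still text, so the phonemizer would read "19 95" as "nineteen ninety-five" and "19 oh 6" as "nineteen oh six".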

Join TestFlight for macOS and iOS!

Join TestFlight here by clicking this link from your mobile device.

Bug Fixes

SAPI clipping at 100% volume: Output gain boost was 1.95× at max volume and is now 1.50×. Previously, volume had to be set to 85% to avoid clipping.
en-us.yaml indent broke parser: sonorantContextAmplitudeScale had wrong indentation, making everything after it invisible to the YAML parser. Result: half-British sounding US English.
Word-final stop sounded like affricate (issue from Dane Stange): pa6 was untouched by the unreleased_word_final rule, leaving it as the dominant burst component with upward spectral tilt. /t/ sounded like /t͡ʃ/ ("eight" → "eitch"). Added pa6 rolloff to all 3 English dialects.
Spanish word-final /ɾ/ double trill (issue #42): /ɾ/ before vowels across word boundaries was incorrectly trilled. Fixed with cross-word allophone conditions.
iOS VoiceOver settings reset on restart: Multiple fixes. The AU extension applied settings before voice/language (wiping them), the host app wasn't flushing UserDefaults to disk, and pitch mode was lost on restart.
Voice profile reset on language change: Pack reload wiped voiceProfileName. Fixed in the C frontend and the NVDA Python driver.
Function-word reduction broken: "for" and "or" were promoted like content words because the wordShape builder omitted length marks, breaking the entire long-vowel exclusion list.
Year splitting edge cases: X000 numbers ("4000" was read as "forty oh zero"), 200X numbers, and 0X second pairs were all split incorrectly.
44 kHz aspiration thinness: White noise spectral density spreads thinner at higher sample rates. Added fourth-root amplitude scaling for aspiration noise.
Diphthong micro-frame buffer overflow: Fixed buffer overrun and scaled micro-frame count with duration.
TTS service crash on Android: Fixed infinite recursion in applyCurrentVoice.
Low-pitch choppiness: Fixed, along with a formant compression issue in the David voice.
Strip PUA-A codepoints: Stray Private Use Area codepoints in IPA input no longer corrupt allophone processing.
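
A minimal sketch of the fourth-root scaling mentioned under "44 kHz aspiration thinness" above, assuming the gain is computed relative to a reference sample rate. The 22050 Hz reference and the name aspiration_gain are hypothetical; these notes do not state the actual reference rate or where the gain is applied.

```python
def aspiration_gain(sample_rate: float, reference_rate: float = 22050.0) -> float:
    """Amplitude multiplier for aspiration noise at a given sample rate.

    Uses the fourth root of the sample-rate ratio, so the gain grows
    slowly as the sample rate rises above the reference.
    Assumption: reference_rate is a placeholder, not a project value.
    """
    if sample_rate <= 0 or reference_rate <= 0:
        raise ValueError("sample rates must be positive")
    return (sample_rate / reference_rate) ** 0.25
```

Under this assumed reference, doubling the sample rate to 44100 Hz raises the aspiration amplitude by a factor of 2^(1/4), roughly 1.19.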

Language Pack Improvements

GenAm vowel formant tuning: Conservative 50–70% moves toward Hillenbrand (1995) male speaker targets for /ɪ/, /æ/, /ɔ/, and /o/.
/d/ frication reduced and /ɹ/ F3 lowered: Clearer stop quality and warmer R sound.
Tap /ɾ/ frication softened: Reduced gritty quality in common words like "accessibility".
RP MOUTH onset retuned and US THOUGHT voiceAmplitude adjusted.
Hungarian word-final /t/ frication reduced: It previously sounded like "cs".

Platform Improvements

NVDA: Year splitting checkbox in settings, pitch mode added to settings ring.
Android: Engine Settings voice picker decoupled from TalkBack active voice. Tab subtitles added.
iOS: Dynamic voice profile discovery, editor auto-start, VoiceOver fixes, tab subtitles.
All platforms: Cascade BW scale slider 50 now maps to 1.0 (was 0.9).

tgeczy/TGSpeechBox v-300b5 TG SpeechBox with phoneme editor, NVDA Addon, SAPI5, Linux, Android, iOS, Mac OS, version 3.0 public beta 5 on GitHub
