github tgeczy/TGSpeechBox v-282
TG SpeechBox with phoneme editor and NVDA Addon, Speech Dispatcher module version 2.82

one month ago

A proper bugfix release!

TGSpeechBox v2.82 Release Notes

Overview

v2.82 is a quality-focused bugfix and tuning release. The major themes are restoring en-gb (British English) to its original quality after accumulated normalization rule drift, improving voiceless stop handling across both English accents, and expanding the allophone rule engine with new matching capabilities. The NVDA driver settings panel has also been significantly streamlined.


en-gb Quality Restoration

The British English voice had regressed from its original Python Speech Player quality due to accumulated normalization rules that were never needed. This release strips them back:

  • Removed the ɔː → ᴏː normalization rule that was shifting F1 by +160 Hz on the THOUGHT/FORCE vowel, making words like "broadcast" sound American instead of RP.
  • Removed all four ˈiə → ˈɪə (NEAR vowel) normalization variants. The original /i/+/ə/ diphthong had a natural 620 Hz F2 drop; the replacement /ɪ/+/ə/ only dropped 400 Hz, creating an audible two-beat "zeheeero" artifact on words like "zero" and "weird."
  • Added ʊ̞ → ʊ preReplacement in en-gb to undo the shared en.yaml ʊ → ʊ̞ rule, which was pushing the GOAT diphthong's offglide F2 +120 Hz too front for RP.
  • Set singleWordFinalHoldMs: 0 to eliminate the weird curvy intonation on single-letter utterances like "A" and "O." The original Python engine had zero special single-word handling.
  • Updated word-final NEAR collapse rules to match the restored /iə/ base form.

M-to-P Nasal Pop Fix (en-gb)

Fixed an audible pop between /m/ and /p/ in phrases like "Sample rate?" with question intonation. The root cause was stopClosureAfterNasalsEnabled: false in en-gb, which meant the nasal transitioned directly into the stop with a one-sample amplitude snap causing a waveform discontinuity. Question intonation made it worse because rising pitch concentrates more energy at the cutoff moment. Fix: set stopClosureAfterNasalsEnabled: true in en-gb.yaml (en-us already had this enabled).
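To see why a hard nasal-to-silence transition clicks, here is a minimal numeric sketch. It is not the engine's closure logic: the sample rate, the sine stand-in for the nasal, and the helper names (nasal_tail, hard_cut, faded_cut, max_jump) are all assumptions made for illustration. It compares the largest sample-to-sample jump when the signal is cut at a waveform peak against the same signal ramped to zero first.

```python
import math

SAMPLE_RATE = 22050  # assumed sample rate, for illustration only


def nasal_tail(n_samples, freq_hz=150.0, amp=0.5):
    """Stand-in for the end of a voiced nasal: a plain sine wave."""
    return [amp * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


def hard_cut(samples, silence_len):
    """Nasal followed immediately by silence, with no closure ramp."""
    return list(samples) + [0.0] * silence_len


def faded_cut(samples, fade_len, silence_len):
    """Same signal, but the last fade_len samples ramp linearly to zero."""
    out = list(samples)
    start = len(out) - fade_len
    for k in range(fade_len):
        out[start + k] *= 1.0 - (k + 1) / fade_len
    return out + [0.0] * silence_len


def max_jump(samples):
    """Largest absolute difference between neighbouring samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


# 478 samples ends the sine close to a peak, the worst case for a hard cut.
tail = nasal_tail(478)
print("hard cut :", max_jump(hard_cut(tail, 10)))
print("5 ms fade:", max_jump(faded_cut(tail, 110, 10)))
```

The hard cut produces a single jump roughly as large as the signal amplitude, while the faded version stays close to the sine's own sample-to-sample step. A closure inserted after the nasal gives the engine room for that kind of ramp.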

Aspirated Voiceless Stops

Word-final voiceless stops (/p/, /t/, /k/) now have per-accent aspiration behavior:

  • en-us: clauseFinalFadeMs: 25 appends a fade token that produces a natural breathy release tail — the "blankʰ" sound.
  • en-gb: clauseFinalFadeMs: 0 clips the stop off cleanly, which is authentic for RP — the clipped "blank" sound.
  • Voiceless stop aspirationAmplitude values in phonemes.yaml have been tuned (/p/: 0.20, /t/: 0.25, /k/: 0.25) and aspiration release time increased from 3ms to 12ms so the burst has time to decay audibly after the stop release.

Word-Final Stop Spectral Shaping

The unreleased_word_final allophone rule now applies spectral shaping via parallel amplitude scaling to make word-final voiceless stops sound softer and airier rather than hard and clicky:

  • Cuts low-frequency burst energy (pa1: 0.3, pa2: 0.5) to remove the thump.
  • Boosts high-frequency energy (pa4: 1.3, pa5: 1.4) for an airy quality.
  • Works within the existing time budget — no extra tokens or added gaps.
  • en-gb uses slightly more aggressive values for tighter RP clipping.
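
As a rough sketch of what scaling parallel amplitudes means, the snippet below multiplies selected amplitudes in a frame by the factors quoted above. The frame layout (a dict with keys pa1 to pa6) and the function name shape_word_final_stop are hypothetical, and treating the values as plain multipliers is an assumption based on the "cuts" and "boosts" wording.

```python
# Factors quoted in the release notes for the unreleased_word_final rule.
WORD_FINAL_SCALES = {"pa1": 0.3, "pa2": 0.5, "pa4": 1.3, "pa5": 1.4}


def shape_word_final_stop(frame, scales=WORD_FINAL_SCALES):
    """Return a copy of frame with the listed parallel amplitudes scaled.

    frame is a hypothetical dict of parallel formant amplitudes; keys that
    are not named in scales are passed through unchanged.
    """
    shaped = dict(frame)
    for key, factor in scales.items():
        if key in shaped:
            shaped[key] *= factor
    return shaped


# A flat burst spectrum makes the effect easy to read off.
flat_burst = {f"pa{i}": 1.0 for i in range(1, 7)}
print(shape_word_final_stop(flat_burst))
```

Because only per-band gains change, the stop keeps its original duration, which matches the note that no extra tokens or gaps are added.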

Neighbor Flag Filters for Allophone Rules

Added four new match conditions to the allophone rule engine: beforeFlags, notBeforeFlags, afterFlags, and notAfterFlags. These filter rules based on the phoneme class flags of neighboring phonemes rather than requiring explicit phoneme key lists.

The need surfaced when word-final aspirated stop rules fired incorrectly across word boundaries, as in "back up" (where /k/ should keep full aspiration into the following vowel). The solution is notBeforeFlags: [vowel], which is clean, language-independent, and doesn't require enumerating every vowel symbol.

The vowel phoneme class flag, along with the other existing flags (stop, nasal, liquid, semivowel, affricate, tap, trill, voiced), is now usable in neighbor context matching — a significant expansion of what allophone rules can express purely through YAML configuration.
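
A minimal sketch of how such neighbor filters could be evaluated follows. It is not the code in allophones.cpp. The function name neighbor_flags_match and the set-based representation are hypothetical, and two semantics are assumed rather than confirmed by the notes: beforeFlags and notBeforeFlags test the following phoneme while afterFlags and notAfterFlags test the preceding one, and a positive filter requires every listed flag while a negative filter rejects on any listed flag.

```python
def neighbor_flags_match(rule, prev_flags, next_flags):
    """Check a rule's neighbor flag filters against the adjacent phonemes.

    rule       -- dict that may contain beforeFlags, notBeforeFlags,
                  afterFlags, notAfterFlags (lists of flag names)
    prev_flags -- set of class flags on the preceding phoneme (empty at an edge)
    next_flags -- set of class flags on the following phoneme (empty at an edge)
    """
    before = set(rule.get("beforeFlags", []))
    not_before = set(rule.get("notBeforeFlags", []))
    after = set(rule.get("afterFlags", []))
    not_after = set(rule.get("notAfterFlags", []))

    if before and not before <= next_flags:
        return False  # a required flag is missing on the following phoneme
    if not_before & next_flags:
        return False  # a forbidden flag is present on the following phoneme
    if after and not after <= prev_flags:
        return False  # a required flag is missing on the preceding phoneme
    if not_after & prev_flags:
        return False  # a forbidden flag is present on the preceding phoneme
    return True


word_final_rule = {"notBeforeFlags": ["vowel"]}

# "back up": /k/ is followed by a vowel, so the word-final rule is skipped.
print(neighbor_flags_match(word_final_rule, {"vowel", "voiced"}, {"vowel", "voiced"}))
# "back door": /k/ is followed by a voiced stop, so the rule may fire.
print(neighbor_flags_match(word_final_rule, {"vowel", "voiced"}, {"stop", "voiced"}))
```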

Changes across 4 files: pack.h (struct fields), pack.cpp (YAML parsing), allophones.cpp (rule matching step 8), and language pack YAML files.

Radiation Mix Baseline Fix

Fixed a bug where radiationMixBaseline at 0.0 was completely suppressing the lip radiation filter, making the voice sound unnaturally dark/muffled when spectral tilt was at zero. Baseline now defaults to 0.15, restoring the natural brightness that lip radiation provides.

cascadeBwScale Tuning

Set cascadeBwScale: 0.9 globally, narrowing cascade formant bandwidths by 10%. This sharpens the formant peaks and improves vowel distinction without introducing resonance instability.
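
For intuition, the sketch below uses the standard second-order digital resonator relation, in which a formant with bandwidth B at sample rate fs has pole radius exp(-pi * B / fs). Whether TGSpeechBox computes its cascade resonators exactly this way is an assumption, and the 80 Hz bandwidth and 22050 Hz sample rate are made-up illustration values. It shows that a 0.9 scale moves the poles slightly closer to the unit circle (a sharper peak) while keeping them inside it (still stable).

```python
import math


def pole_radius(bandwidth_hz, sample_rate_hz):
    """Pole radius of a two-pole digital resonator with the given bandwidth."""
    return math.exp(-math.pi * bandwidth_hz / sample_rate_hz)


CASCADE_BW_SCALE = 0.9   # value from the release notes
SAMPLE_RATE = 22050      # assumed for illustration
BANDWIDTH_HZ = 80.0      # assumed formant bandwidth for illustration

original = pole_radius(BANDWIDTH_HZ, SAMPLE_RATE)
scaled = pole_radius(BANDWIDTH_HZ * CASCADE_BW_SCALE, SAMPLE_RATE)

print("radius at full bandwidth  :", original)
print("radius at scaled bandwidth:", scaled)
```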

Boundary Smoothing — YAML-Configurable Fade Times

All 13 boundary smoothing fade time parameters are now exposed as YAML settings under the boundarySmoothing: block, allowing per-language-pack tuning of transition durations between phoneme classes:

vowelToStopFadeMs, stopToVowelFadeMs, vowelToFricFadeMs, fricToVowelFadeMs, vowelToNasalFadeMs, nasalToVowelFadeMs, vowelToLiquidFadeMs, liquidToVowelFadeMs, nasalToStopFadeMs, liquidToStopFadeMs, fricToStopFadeMs, stopToFricFadeMs, vowelToVowelFadeMs

These were previously hardcoded in the C++ boundary smoothing pass.
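
The parameter names follow a regular "<from>To<To>FadeMs" pattern, so a lookup by phoneme class pair can be sketched as below. The helper names fade_key and fade_ms, the flat dict standing in for the boundarySmoothing: block, the fallback of 0.0 for unlisted pairs, and the millisecond values are all hypothetical; the notes do not give the shipped defaults.

```python
def fade_key(prev_class, next_class):
    """Build a setting name such as 'vowelToStopFadeMs' from two class names."""
    return f"{prev_class}To{next_class.capitalize()}FadeMs"


def fade_ms(settings, prev_class, next_class, default_ms=0.0):
    """Look up the fade time for a class transition, falling back to a default.

    settings stands in for the parsed boundarySmoothing: block of a language
    pack; pairs with no dedicated setting fall back to default_ms.
    """
    return settings.get(fade_key(prev_class, next_class), default_ms)


# Placeholder values only, not the shipped defaults.
boundary_smoothing = {
    "vowelToStopFadeMs": 12.0,
    "nasalToStopFadeMs": 8.0,
    "fricToVowelFadeMs": 10.0,
}

print(fade_key("vowel", "stop"))
print(fade_ms(boundary_smoothing, "nasal", "stop"))
print(fade_ms(boundary_smoothing, "stop", "nasal"))
```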

NVDA Driver Settings Panel Cleanup

Removed 71 _addQuickTextField() calls from the NVDA settings panel (~600 lines of code deleted). The quick-edit fields were cluttering the UI and were redundant with the existing combo box approach.

  • The combo box is now alphabetized for easier navigation.
  • Renamed "Other settings:" label to "Edit setting:" for clarity.
  • Added all new v2.82 settings to the _extraKeys list so they appear in the combo box: boundary smoothing dotted keys, clauseFinalFadeMs, stopClosureAfterNasalsEnabled, and allophone-related toggles.
  • Removed supporting infrastructure: _addQuickTextField(), _onQuickValueChanged(), _updateQuickDisplays(), and the self._quickCtrls dictionary.

Legacy Pitch Inflection Slider

Added a user-facing pitch inflection control to the NVDA driver panel, giving users direct access to the legacy pitch model's inflection amount without needing the combo box.

