tgeczy/TGSpeechBox v-310b3 on GitHub

TGSpeechBox v3.10 Beta 3 — fast-rate /ɡ/ intelligibility (DSP v9 FricationTilt)

Ships the Bug 1 fix from the #95 two-bug analysis: a rate-adaptive frication
spectral tilt that prevents /ɡ/ bursts from drifting into alveolar-tap
territory at fast speech rates. 29-Bloo's "Pegue → Pere" report on #95
was the diagnostic gift that isolated this; Tomi's ear-testing across
~100 rendered variants confirmed the precise spectral region and
magnitude that produce the fix.

The fix in one paragraph

At speeds above 1.2x normal, the /ɡ_es/ burst's high-frequency parallel
amplitudes (pa5 at 3750 Hz, pa6 at 4900 Hz) stay proportionally prominent
while the burst body time-compresses. That creates an upward spectral
tilt that native ears read as a tap [ɾ] or as [d]. A new fricationTiltDb
FrameEx field scales pa4/pa5/pa6 by a frequency-dependent amount (pivot
1500 Hz, asymmetric rolloff above the pivot only, so pa1/pa2/pa3 are
preserved entirely). The ipa_engine applies a rate-modulated negative tilt
during stop burst emission: 0 dB at speed ≤1.2, -1 dB at 1.3, -5 dB at
1.7, -8 dB at 2.0. This preserves the velar F3 signature that makes /ɡ/
sound like /ɡ/ and removes only the high-frequency click that was causing
the tap misperception.
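
The four published schedule points happen to lie on a straight line (-10 dB per 1.0x of speed above 1.2x), so the rate modulation can be sketched as below. This is only an illustration: the function name burstTiltDbForSpeed is hypothetical, the shipped engine may interpolate differently between the listed points, and the clamp at -8 dB above 2.0x is an assumption that these notes do not state.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not the engine's actual API): maps a speech-rate
// multiplier to a frication tilt in dB for stop-burst frames.
// Published points: 1.2 -> 0 dB, 1.3 -> -1 dB, 1.7 -> -5 dB, 2.0 -> -8 dB.
inline double burstTiltDbForSpeed(double speed) {
    assert(speed > 0.0);                       // rate multipliers are positive
    constexpr double kOnsetSpeed = 1.2;        // no tilt at or below this rate
    constexpr double kSlopeDbPerUnit = -10.0;  // slope fitted to the listed points
    constexpr double kFloorDb = -8.0;          // ASSUMPTION: hold at -8 dB past 2.0x

    if (speed <= kOnsetSpeed) {
        return 0.0;                            // legacy behavior at normal/slow rates
    }
    return std::max(kFloorDb, kSlopeDbPerUnit * (speed - kOnsetSpeed));
}
```

Returning exactly 0.0 at speed ≤1.2 matches the stated goal of leaving normal and slow rates untouched.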

Backed by

Kingston et al. 2008 (Journal of Phonetics) — perceptual integration
of low-frequency energy across stop boundaries
Smits et al. 1996 (JASA) — burst dominates place-of-articulation
perception in front-vowel contexts (why "pegue" was most vulnerable)
Hualde et al. 2011 (Laboratory Phonology) — velars intrinsically less
differentiable than labials/coronals in Spanish

Changes in this beta

DSP engine (v9)

New FrameEx field fricationTiltDb (replaces unused caN0 field
from v9's initial addition). Net-zero struct size change vs b201.
Consumer in formantGenerator.h parallel path: scales each pa_i by
10^(tiltDb * max(0, pf_i - 1500Hz) / (20 * 3000Hz)). When tiltDb=0
(the legacy/normal-speed case), a fast path returns 1.0 without calling pow().
Rate modulator in frame_emit.cpp burst emission: applies tilt only
during burst + decay frames, restores to 0 before next phoneme.
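
The per-formant gain formula above can be written as a small standalone function. This is a minimal sketch, not the code in formantGenerator.h: the name fricationTiltGain and the constant names are hypothetical, and it assumes pf_i is the parallel formant frequency in Hz that belongs to amplitude pa_i.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical free function illustrating the published formula:
//   gain = 10^(tiltDb * max(0, pf - 1500 Hz) / (20 * 3000 Hz))
// ASSUMPTION: formantHz is the parallel formant frequency paired with pa_i.
inline double fricationTiltGain(double tiltDb, double formantHz) {
    constexpr double kPivotHz = 1500.0;  // no scaling at or below the pivot
    constexpr double kSpanHz = 3000.0;   // full tiltDb is reached at pivot + span

    if (tiltDb == 0.0) {
        return 1.0;                      // fast path: skip pow() entirely
    }
    const double excessHz = std::max(0.0, formantHz - kPivotHz);
    return std::pow(10.0, tiltDb * excessHz / (20.0 * kSpanHz));
}
```

Under these assumptions, a -8 dB tilt scales pa5 at 3750 Hz by about 0.50 (roughly -6 dB) and pa6 at 4900 Hz by about 0.35 (roughly -9.1 dB), while any formant at or below 1500 Hz keeps a gain of 1.0.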

Per-phoneme knobs

fricationTiltDb and closureGapMs (b2's closure decoupling override)
are now exposed in all three phoneme editors:

Win32 phoneme editor: both fields available in the modification
dialog with descriptive labels.
Android phoneme editor: both fields in the phoneme-field list
with proper slider ranges (tilt: -15..+15 dB, closure: 0..60 ms).
iOS phoneme editor: same.

Testers who want to tune these on a per-phoneme basis (e.g. experiment
with different fricationTiltDb baselines on /s/, /ʃ/) can now do so
directly from the editor on their platform of choice.

Cleanup

Removed unused caN0 FrameEx field (DSP v9 infrastructure from
v3.10b1.1 that was never consumed by any shipped phoneme, and had a
documented FIR HF-boost gotcha). The /l_es/ lateral antiresonance
work used the existing caNP path instead. This is the cleanest moment
to prune — swapped in-place with fricationTiltDb, no user-visible
behavior change since no phoneme ever set caN0 to non-zero.

What this beta does NOT address

Bug 2 (cluster /l/+/ɣ/ tap percept): improved from b201, not
fully eliminated. Different root cause (schwa+closure temporal
template mimicking tap articulation). Not scoped for b3; may
improve incidentally from the tilt work, will revisit in b4 if
still perceptible after community testing.
dialogo /o-ɡ-o/ at normal rate: inherent phonetic difficulty
per Kingston 2008 (low-frequency energy integration across stop
boundaries with low-F1 vowels flanking). Will likely improve at
fast rates from this fix; normal-rate remains hard.
Other items on the 3.10 roadmap (gender-aware number expansion #90,
currency #83, Android engine-tab #97, emoji translations #96) are
not in this beta.

Please ear-test

Native Spanish speakers, especially @gregodejesus2, @29-Bloo,
@rmcpantoja, @yaresDg, @dgomez42 — test on Windows (NVDA or SAPI),
Android (APK), or Linux (tarball when CI completes).

Listen especially at fast speech rates:

Pegue, fuego, negar — /e-ɡ-e/ and /e-ɡ-a/ contexts that were
collapsing into tap-like sounds
Lago, pagar, amigo — baseline intervocalic /ɣ/ that should
stay stable or improve slightly
Algo, salga, algunas — cluster contexts (Bug 2); observe if
incidental improvement happens, even though not directly targeted

At normal speeds everything should sound the same or better than b201
— there should be zero regression at speed ≤1.2.

Per-phoneme tuning invitation

If you want to experiment with different fricationTiltDb values on a
specific phoneme, you can now do so directly in the phoneme editor
(Win32/Android/iOS). Share your findings on #95 — we'd genuinely
value your tuning intuitions becoming pack contributions.

Testing

All C++ unit tests (doctest) pass
Zero compile warnings in all three build configurations (MinSizeRel)
Rate-adaptive tilt renders confirmed byte-different from legacy at
speed > 1.2, byte-identical at speed ≤1.2 (no regression at
normal/slow rates)

If something sounds wrong

Post on #95 with the affected word(s) and the speech rate you were
using. We'll tune and re-release. Hobby pace as always.

— Tamas + Claudeo (Opus 4.7)

Join the Test

Want to help test before the full v3.10 release?

tgeczy/TGSpeechBox v-310b3
TG SpeechBox with phoneme editor, NVDA Addon, SAPI5, Linux, Android, iOS, Mac OS, version 310 beta 3

on GitHub
