tgeczy/TGSpeechBox v-310b2 on GitHub

TGSpeechBox v3.10 Beta 2 — reverted b101 /l/ experiment, shipping correct /ɣ/ approximant tuning

This beta replaces the withdrawn v3.10 Beta 1.1 (b101), which introduced a
Spanish /l/ anti-resonance that multiple native-speaker testers reported as
catastrophic (hola→hoda, "La le li lo lu" → "ga gue gui go gu", /l/ losing
its lateral character entirely). You were right: that approach was the
wrong tool for the job. This beta rolls it back and ships the correct fix,
based on a closer reading of the acoustic-phonetics literature.

What this beta changes vs b1

Reverted: the /l_es/ lateral anti-resonance from b101 (the whole
caNP + cfN0=1700 path) — back to pre-b101 values. /l/ should now sound
exactly as it did in b1, which testers confirmed was working.

Fixed: the underlying /ɣ/↔/l/ perceptual collapse from a different
angle — by treating Spanish intervocalic /ɣ/ as what it actually is
acoustically: the voiced velar APPROXIMANT [ɣ̞], not a fricative. Diego
Gomez's original #84 diagnosis pointed this out and we initially went the
wrong direction (sharpening /l/ instead of softening /ɣ/). This beta
corrects course.

The acoustic theory

The confusion in words like "entregado" stems from our /ɣ/ being too
formant-like. Per Martínez-Celdrán and Mackenzie, Spanish intervocalic /ɣ/
is phonetically [ɣ̞], an approximant with near-zero turbulent frication.
The lenition cue is the intensity difference (IntDiff) relative to the
flanking vowels (Kingston 2008), not frication noise. So this beta lowers
F2 into the actual approximant range, drops frication to near zero, and
lowers voice amplitude to carry the IntDiff cue.
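
As a rough illustration of the intensity-difference cue, the sketch below compares the RMS level of a consonant segment with the average level of its flanking vowels. The function names, the RMS-based definition, and the synthetic sine segments are all assumptions made for illustration; the exact IntDiff measure in Kingston 2008 and any analysis used for this release may differ.

```python
import math

def rms_db(samples):
    """RMS level of a segment in dB relative to an amplitude of 1.0."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12))  # floor avoids log10(0) on silence

def intensity_difference_db(vowel_before, consonant, vowel_after):
    """Mean flanking-vowel level minus consonant level, in dB (assumed definition)."""
    flank_db = (rms_db(vowel_before) + rms_db(vowel_after)) / 2.0
    return flank_db - rms_db(consonant)

def tone(amplitude, n=1600, freq_hz=100.0, fs=16000.0):
    """Hypothetical stand-in for a voiced segment: a plain sine wave."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / fs) for i in range(n)]

vowel = tone(1.0)        # full-level flanking vowel
approximant = tone(0.5)  # consonant at half the vowel amplitude
int_diff = intensity_difference_db(vowel, approximant, vowel)
print(f"IntDiff = {int_diff:.2f} dB")  # halving the amplitude gives about 6.02 dB
```

A larger value means the consonant dips further below its neighbours, which is the kind of cue the lowered voice amplitude is meant to strengthen.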

Parameter changes on /ɣ_es/

cf2: 1450 → 1250 (approximant range; Fant 1960 [x]=1050)
pf2: 1400 → 1250 (match cascade)
fricationAmplitude: 0.20 → 0.08 (near-zero turbulence)
voiceAmplitude: 0.82 → 0.70 (intensity-based lenition cue)
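
To put the four changes on a common footing, the sketch below expresses the formant moves in Hz and the amplitude moves in dB. It assumes that fricationAmplitude and voiceAmplitude are linear gains, which these notes do not state; if the engine maps them differently, the dB figures will not apply. The dictionary layout and helper name are hypothetical.

```python
import math

# (old, new) values copied from the parameter list above
changes = {
    "cf2": (1450.0, 1250.0),
    "pf2": (1400.0, 1250.0),
    "fricationAmplitude": (0.20, 0.08),
    "voiceAmplitude": (0.82, 0.70),
}

def amplitude_change_db(old, new):
    # ASSUMPTION: the amplitude parameters are linear gains
    return 20.0 * math.log10(new / old)

summary = {}
for name, (old, new) in changes.items():
    if name.endswith("Amplitude"):
        summary[name] = amplitude_change_db(old, new)
        print(f"{name}: {summary[name]:+.2f} dB")
    else:
        summary[name] = new - old
        print(f"{name}: {summary[name]:+.0f} Hz")
```

Under that assumption the frication cut is roughly 8 dB while the voicing cut is under 1.5 dB, so the change mostly removes noise rather than overall loudness.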

Acoustic confirmation via LPC

A new regression test measures the F2 gap between /ɣ/ and /l/ in minimal-
pair word contexts at the 20 ms acoustic-invariance window (Blumstein &
Stevens 1979 — where listeners form categorical consonant judgments):

/ɣ/ F2 = 1161 Hz (canonical approximant range)
/l/ F2 = 1608 Hz (matches Kirkham et al. 2019 Spanish /l/ ≈1583 Hz)
ΔF2 = 446 Hz (about 4.5× the perceptual JND of 100 Hz)

Before b2, that same measurement showed only a 56 Hz gap, below the 100 Hz
perceptual threshold, which is exactly why native testers couldn't
distinguish the two.
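
These notes do not say how the LPC measurement is implemented, so the following is only a generic sketch of LPC formant estimation: autocorrelation, Levinson-Durbin recursion, then peak-picking on the all-pole spectrum. The signal is a synthetic impulse response of two cascaded resonators at 500 Hz and 1250 Hz, not engine output, and every function name here is hypothetical.

```python
import cmath
import math

def resonator(x, freq_hz, bw_hz, fs):
    """Two-pole resonator: y[n] = x[n] - a1*y[n-1] - a2*y[n-2]."""
    r = math.exp(-math.pi * bw_hz / fs)
    a1 = -2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs)
    a2 = r * r
    y, y1, y2 = [], 0.0, 0.0
    for s in x:
        out = s - a1 * y1 - a2 * y2
        y.append(out)
        y1, y2 = out, y1
    return y

def lpc(signal, order):
    """Autocorrelation-method LPC via Levinson-Durbin; returns [1, a1, ..., ap]."""
    n = len(signal)
    r = [sum(signal[i] * signal[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # remaining prediction error
    return a

def lpc_peaks(a, fs, step_hz=5.0):
    """Frequencies (Hz) of local maxima of 1/|A(e^jw)| on a uniform grid."""
    freqs, mags = [], []
    for i in range(int(fs / 2.0 / step_hz) + 1):
        f = i * step_hz
        w = 2.0 * math.pi * f / fs
        A = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        freqs.append(f)
        mags.append(1.0 / abs(A))
    return [freqs[i] for i in range(1, len(mags) - 1)
            if mags[i] > mags[i - 1] and mags[i] >= mags[i + 1]]

fs = 10000.0
impulse = [1.0] + [0.0] * 1999
sig = resonator(resonator(impulse, 500.0, 80.0, fs), 1250.0, 100.0, fs)
peaks = lpc_peaks(lpc(sig, 4), fs)
print(peaks)  # two peaks, close to the 500 Hz and 1250 Hz resonances
```

Real speech would need windowing, pre-emphasis, and a higher LPC order; the order of 4 works here only because the synthetic signal has exactly two resonances.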

Please ear-test

@gregodejesus2, @rmcpantoja, @yaresDg, @dgomez42, @29-Bloo — please test
on Windows (NVDA or SAPI), Android (APK), or Linux (tarball):

Test words:

entregado, diálogo, lugar, agua, laguna, jugar, código (/ɣ/ examples
that were previously heard as /l/-like or too weak)
hola, lunes, alga, olvido, final, almacén (/l/ should be back to how
it sounded in b1)
siguiente (a front-vowel /ɣ/ context — we stayed above the historic
cf2=1200 /u/-coloring floor)

Listen for:

Is /ɣ/ now audibly different from /l/ in connected speech?
Does /l/ sound like natural Spanish /l/ again (b1-style)?
Any "/u/-coloring" on /ɣ/ before front vowels like /i/ and /e/?
Any other regressions vs b1?

What's NOT in this beta

Scheduled for later v3.10 beta builds:

fricationTiltDb FrameEx field (coming in b3) — will enable dialectal /s/
brightness difference between Mexican (laminal, brighter) and Castilian
(apical, darker). Addresses #74 and #81.
endCb1/2/3 FrameEx fields (coming in b4) — per-phoneme bandwidth
evolution within a phone. Enables narrower /l/ B2 at steady state,
wider at boundaries (Stevens 1998).
Echo at slow rates (#98)
Currency text processing (#83)
Android Engine-tab language selector (#97)

Testing

44 C++ unit tests (doctest) + all Python tests pass
New regression test locks in the /ɣ/↔/l/ F2 separation invariant
No collateral damage to previous passing tests
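
The F2 separation invariant can be pictured as a check like the one below. This is a hypothetical sketch, not the project's actual test: it hard-codes the F2 values reported in these notes instead of measuring synthesized audio, and the function names are invented for illustration. The rounded values give a 447 Hz gap, 1 Hz off the reported 446 Hz, presumably because the reported figure was computed before rounding.

```python
JND_HZ = 100.0  # perceptual threshold quoted in these notes

def f2_gap_hz(f2_velar, f2_lateral):
    """F2 distance between the /ɣ/ and /l/ measurements, in Hz."""
    return f2_lateral - f2_velar

def separation_ok(gap_hz, jnd_hz=JND_HZ):
    """True when the gap is wider than the just-noticeable difference."""
    return gap_hz > jnd_hz

def test_f2_separation_invariant():
    gap = f2_gap_hz(1161.0, 1608.0)  # values reported for this beta
    assert separation_ok(gap)
    assert not separation_ok(56.0)   # the pre-b2 gap would fail the check

test_f2_separation_invariant()
print(f"gap = {f2_gap_hz(1161.0, 1608.0):.0f} Hz, ratio to JND = {f2_gap_hz(1161.0, 1608.0) / JND_HZ:.2f}")
```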

If something sounds wrong

Post on #95 with the word(s) affected and what you hear vs what you'd
expect. A scientific approach beats guessing: we can often turn
your reports into measurable regression tests.

— Tamas + Claudeo (Opus 4.7)
