tgeczy/TGSpeechBox v-310b201 on GitHub

TGSpeechBox v3.10 Beta 2.0.1 — actual /ɣ/ fix for #84/#95 (architectural, not parametric)

This is the THIRD attempt at solving the perceptual /ɣ/ inaudibility from
issues #84 and #95 ("entregado" sounding like "entredado" / "entrelado").

The story so far:

b1: shipped /ɣ_es/ approximant tuning. Testers said no improvement.
b101 (informal): /l_es/ lateral antiresonance. Improved /l/ but not /ɣ/.
b2: course-corrected to a softer /ɣ/ approximant via amplitude knobs.
Testers said no improvement.
b201 (informal): /ɣ_es/ amplitude refinement. Testers said no improvement.

We were doing parametric tuning when the problem was architectural. This
beta ships the architectural fix.

The architectural finding

/ɡ_es/ (the softened velar stop variant we use for Spanish /ɣ/) had
durationScale: 0.27 to give a short ~8ms intervocalic closure (Spanish
lenition without an audible word-break). The intent was right; the
mechanism was wrong.

durationScale was coupling two things that need to vary independently:
the closure-gap length AND the stop-body length. So when we asked for an
8ms closure, the burst body also got chopped to 27% of normal — killing
the actual stop sound. The "softened velar" was an approximant in disguise.
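
The coupling is easy to see with a little arithmetic. The sketch below assumes the 30ms plain-/ɡ/ closure quoted later in these notes as the unscaled baseline, plus a hypothetical 60ms nominal stop body (the real body length is not stated here). `scaled_timing` is an illustrative helper, not the engine's code.

```python
def scaled_timing(closure_ms, body_ms, duration_scale):
    """Model the old behaviour: one factor shrinks closure AND body together."""
    return closure_ms * duration_scale, body_ms * duration_scale


NOMINAL_CLOSURE_MS = 30.0  # plain /ɡ/ closure, taken from these notes
NOMINAL_BODY_MS = 60.0     # hypothetical value, for illustration only

closure, body = scaled_timing(NOMINAL_CLOSURE_MS, NOMINAL_BODY_MS, 0.27)
print(f"closure: {closure:.1f} ms, body: {body:.1f} ms")
# closure: 8.1 ms, body: 16.2 ms
```

Under these assumptions, asking for the ~8ms closure also cuts the body to roughly a quarter of its nominal length, which is the side effect described above.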

A diagnostic render with full /ɡ/ + global 8ms closure (hardG_gap8.wav)
sounded right. The same parameters via /ɡ_es/ did not. That mismatch
told us where to dig.

The fix

New per-phoneme closureGapMs field on PhonemeDef. Decouples closure
timing from body length. Wiring:

/ɡ_es/ updated: removed durationScale, added closureGapMs: 8.
Result: 8ms closure + full burst body. Sounds like a real stop.
es.yaml: /ɣ/ → /ɡ_es/ for intervocalic contexts.
Cluster contexts (lᵊɣ, ɾᵊɣ, sᵊɣ, ɣɾ): collapse to plain /ɡ/
(30ms closure) instead. Why: an 8ms closure after a voiced schwa gets
masked by the schwa's voicing and reads as approximant /ð/ ("algo" →
"al-dho"). Plain /ɡ/'s longer closure has enough silence to break
voicing in cluster contexts.
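
A minimal sketch of the decoupled timing and the context routing described above. This `PhonemeDef` is a hypothetical Python stand-in that models only two fields. The 60ms nominal body, the rule that an explicit closure gap is never scaled, and the `route_gamma` helper are all assumptions made for illustration. They are not the engine's actual implementation or the contents of es.yaml.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DEFAULT_CLOSURE_MS = 30.0  # plain /ɡ/ closure, taken from these notes
NOMINAL_BODY_MS = 60.0     # hypothetical value, for illustration only

PLAIN_G = "ɡ"
SOFT_G = "ɡ_es"


@dataclass
class PhonemeDef:
    """Hypothetical stand-in; only the two timing fields are modelled."""
    name: str
    duration_scale: float = 1.0
    closure_gap_ms: Optional[float] = None  # None means "use the default"


def stop_timing(p: PhonemeDef) -> Tuple[float, float]:
    """Return (closure_ms, body_ms) for a stop."""
    body = NOMINAL_BODY_MS * p.duration_scale
    if p.closure_gap_ms is not None:
        closure = p.closure_gap_ms  # assumed: explicit gap is used as-is
    else:
        closure = DEFAULT_CLOSURE_MS * p.duration_scale
    return closure, body


def route_gamma(prev: str, nxt: str) -> str:
    """Pick the replacement for /ɣ/ from its neighbours.

    `prev` is the segment before /ɣ/ (ignoring any epenthetic schwa),
    `nxt` the segment after it. Only the contexts listed in these
    notes are modelled.
    """
    if prev in {"l", "ɾ", "s"} or nxt == "ɾ":
        return PLAIN_G  # cluster: needs the longer closure
    return SOFT_G       # intervocalic: short closure, full body


old = PhonemeDef(SOFT_G, duration_scale=0.27)
new = PhonemeDef(SOFT_G, closure_gap_ms=8)
print(stop_timing(old))  # closure and body both shrunk
print(stop_timing(new))  # short closure, body untouched
```

For example, `route_gamma("l", "o")` ("algo") selects plain /ɡ/, while `route_gamma("e", "a")` ("entregado") selects /ɡ_es/.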

Verification

42 WAV files rendered, ear-tested by Tamas + spectrogram-analyzed by
Claudeo (tools/analyze_g_closure.py). New /ɡ_es/ matches the
known-good hardG_gap8.wav reference within ~1 dB on every closure
metric (depth %, depth dB, burst-rise dB, spectral COG), cleanly
separated from the failed-attempt references.
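
To make one of these metrics concrete, here is a rough sketch of how a closure depth in dB and a ~1 dB tolerance check could be computed. It is not the logic of tools/analyze_g_closure.py, which is not shown in these notes. The function names, the RMS-based definition of depth, and the synthetic signal are all assumptions for illustration.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def closure_depth_db(vowel, closure):
    """Level drop from the neighbouring vowel to the closure, in dB."""
    return 20.0 * math.log10(rms(vowel) / rms(closure))


def within_tolerance(candidate_db, reference_db, tol_db=1.0):
    """True if two dB measurements agree within tol_db."""
    return abs(candidate_db - reference_db) <= tol_db


# Synthetic stand-in signal: a 125 Hz tone at 16 kHz, loud for the
# vowel and ten times quieter for an 8 ms (128-sample) closure.
SR = 16000
vowel = [0.5 * math.sin(2 * math.pi * 125 * n / SR) for n in range(1600)]
closure = [0.05 * math.sin(2 * math.pi * 125 * n / SR) for n in range(128)]

depth = closure_depth_db(vowel, closure)
print(f"closure depth: {depth:.1f} dB")  # closure depth: 20.0 dB
print(within_tolerance(depth, 20.4))     # True
```

A real comparison would run the same measurement on the candidate render and on the reference WAV, then apply the tolerance check to each metric.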

Words confirmed sounding right:

intervocalic (/ɡ_es/, 8ms closure): entregado, lago, hago, haga, diga,
amigo, pagar, agua, luego
cluster (plain /ɡ/, 30ms closure): algo, largo, rasgar, regla

Word-initial /ɣ/ (e.g. "gusano") was already routed to plain /ɡ/ before
this beta — unchanged.

What this beta does NOT fix

"siguiente" still leans toward "sillente" — that's a /ʝ/ tuning
issue, tracked under #20. Independent of /ɣ/.
"entrelado" /l/-vs-/d/ — the b101 lateral antiresonance helped
significantly but did not fully resolve it. Independent of /ɣ/. Will revisit.

Testing

49 C++ unit tests (doctest) all pass — including the 18 pre-existing
/ɣ/-trace tests, updated to find /ɡ_es/ post-replacement.
New Hypothesis check test asserts the new /ɡ_es/ matches the
hardG_gap8 reference acoustically.
New Audit test renders 13 Spanish words covering intervocalic +
cluster + special-case contexts.
papa/tapa render byte-identical to pre-fix output (no override on /p,t,k/).
Python pytest suite still passes.

Please ear-test

Native Spanish speakers, especially @gregodejesus2, @29-Bloo,
@rmcpantoja, @yaresDg, @dgomez42 — please test on Windows (NVDA or
SAPI) or Android (APK).

Comparison test words:

entregado, lago, hago, amigo, agua (pure intervocalic — should sound
natural, with audible velar character but no word-break)
algo, largo, rasgar (cluster — should have a clearer hard-G feel)
regla, siguiente (control — unchanged behavior, /ɣ/ not in effect)

If something sounds wrong

Post on #84 or #95 with the affected word(s). We'll tune and re-release.

— Tamas + Claudeo (Opus 4.7)
