TGSpeechBox v3.10 Beta 2.0.1 — actual /ɣ/ fix for #84/#95 (architectural, not parametric)
This is the THIRD attempt at solving the perceptual /ɣ/ inaudibility from
issues #84 and #95 ("entregado" sounding like "entredado" / "entrelado").
The story so far:
- b1: shipped /ɣ_es/ approximant tuning. Testers said no improvement.
- b101 (informal): /l_es/ lateral antiresonance. Improved /l/ but not /ɣ/.
- b2: course-corrected to softer /ɣ/ approximant via amplitude knobs.
  Testers said no improvement.
- b201 (informal): /ɣ_es/ amplitude refinement. Testers said no improvement.
We were doing parametric tuning when the problem was architectural. This
beta ships the architectural fix.
The architectural finding
/ɡ_es/ (the softened velar stop variant we use for Spanish /ɣ/) had
durationScale: 0.27 to give a short ~8ms intervocalic closure (Spanish
lenition without an audible word-break). The intent was right; the
mechanism was wrong.
durationScale was coupling two things that need to vary independently:
the closure-gap length AND the stop-body length. So when we asked for an
8ms closure, the burst body also got chopped to 27% of normal — killing
the actual stop sound. The "softened velar" was an approximant in disguise.
A diagnostic render with full /ɡ/ + global 8ms closure (hardG_gap8.wav)
sounded right. The same parameters via /ɡ_es/ did not. That mismatch
told us where to dig.
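To make the coupling concrete, here is a minimal Python sketch, not the engine's actual code. It assumes the old path multiplied both the closure gap and the stop body by the same durationScale. The 30 ms baseline closure is the plain /ɡ/ figure quoted in these notes; the 40 ms body length and every name below are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

# Baseline timings for plain /ɡ/. The 30 ms closure comes from these notes;
# the 40 ms body is an ASSUMED figure used only for illustration.
BASE_CLOSURE_MS = 30.0
BASE_BODY_MS = 40.0


@dataclass
class StopTiming:
    closure_ms: float
    body_ms: float


def scaled_timing(duration_scale: float) -> StopTiming:
    """Assumed old behaviour: one factor shrinks closure AND body together."""
    return StopTiming(
        closure_ms=BASE_CLOSURE_MS * duration_scale,
        body_ms=BASE_BODY_MS * duration_scale,
    )


old_g_es = scaled_timing(0.27)
print(f"closure={old_g_es.closure_ms:.1f} ms, body={old_g_es.body_ms:.1f} ms")
```

Under these assumed numbers the closure lands near the intended 8 ms, but the body shrinks to roughly 11 ms as a side effect, which is the coupling described above.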
The fix
New per-phoneme closureGapMs field on PhonemeDef. Decouples closure
timing from body length. Wiring:
- /ɡ_es/ updated: removed durationScale, added closureGapMs: 8.
  Result: 8ms closure + full burst body. Sounds like a real stop.
- es.yaml: /ɣ/ → /ɡ_es/ for intervocalic contexts.
- Cluster contexts (lᵊɣ, ɾᵊɣ, sᵊɣ, ɣɾ): collapse to plain /ɡ/
  (30ms closure) instead. Why: 8ms after a voiced schwa gets masked by
  the schwa's voicing and reads as approximant /ð/ ("algo" → "al-dho").
  Plain /ɡ/'s longer closure has enough silence to break voicing in
  cluster contexts.
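The decoupling and the routing rule can be sketched roughly as follows. This is an illustrative Python model, not the real PhonemeDef or the real es.yaml rules: closure_gap_ms stands in for closureGapMs, the 40 ms body length is an assumed figure, and route_gamma only knows the four cluster contexts named above plus the word-initial case, treating everything else as intervocalic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BASE_CLOSURE_MS = 30.0  # plain /ɡ/ closure, per these notes
BASE_BODY_MS = 40.0     # ASSUMED body length, illustration only

PLAIN_G = "ɡ"
G_ES = "ɡ_es"
GAMMA = "ɣ"


@dataclass
class PhonemeDef:
    name: str
    duration_scale: float = 1.0
    closure_gap_ms: Optional[float] = None  # hypothetical stand-in for closureGapMs


def timing(p: PhonemeDef) -> Tuple[float, float]:
    """Return (closure_ms, body_ms); an explicit gap overrides the scaled one."""
    if p.closure_gap_ms is not None:
        closure = p.closure_gap_ms
    else:
        closure = BASE_CLOSURE_MS * p.duration_scale
    return closure, BASE_BODY_MS * p.duration_scale


PHONEMES = {
    PLAIN_G: PhonemeDef(PLAIN_G),
    G_ES: PhonemeDef(G_ES, closure_gap_ms=8.0),
}

# Cluster contexts listed in the notes: lᵊɣ, ɾᵊɣ, sᵊɣ (left side) and ɣɾ (right side).
LEFT_CLUSTERS = {("l", "ᵊ"), ("ɾ", "ᵊ"), ("s", "ᵊ")}
RIGHT_CLUSTERS = {"ɾ"}


def route_gamma(phones: List[str], i: int) -> str:
    """Choose the replacement for the /ɣ/ at index i of a phoneme list."""
    if i == 0:
        return PLAIN_G  # word-initial was already plain /ɡ/
    if i >= 2 and (phones[i - 2], phones[i - 1]) in LEFT_CLUSTERS:
        return PLAIN_G
    if i + 1 < len(phones) and phones[i + 1] in RIGHT_CLUSTERS:
        return PLAIN_G
    return G_ES  # simplification: everything else counts as intervocalic


lago = ["l", "a", GAMMA, "o"]
algo = ["a", "l", "ᵊ", GAMMA, "o"]
for word in (lago, algo):
    name = route_gamma(word, word.index(GAMMA))
    print("".join(word), "->", name, timing(PHONEMES[name]))
```

With these assumed numbers, "lago" gets the 8 ms closure and "algo" the 30 ms closure, and both keep the full body length because closure timing no longer rides on the scale factor.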
Verification
42 WAV files rendered, ear-tested by Tamas + spectrogram-analyzed by
Claudeo (tools/analyze_g_closure.py). New /ɡ_es/ matches the
known-good hardG_gap8.wav reference within ~1 dB on every closure
metric (depth %, depth dB, burst-rise dB, spectral COG), cleanly
separated from the failed-attempt references.
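For readers who want a feel for what the closure metrics measure, here is a rough Python sketch. It is not tools/analyze_g_closure.py: the metric definitions, function names, and sample envelopes are all assumptions, and spectral COG is left out to keep it dependency-free. It takes a per-frame amplitude envelope, measures how far the closure dips below the preceding vowel, and checks whether two renders agree within 1 dB.

```python
import math
from typing import Dict, Sequence


def closure_metrics(env: Sequence[float], start: int, end: int) -> Dict[str, float]:
    """Measure the dip in env[start:end] against the frames around it.

    Assumes start > 0, end < len(env), and strictly positive amplitudes.
    """
    vowel = sum(env[:start]) / start   # mean level before the closure
    floor = min(env[start:end])        # quietest frame inside the closure
    release = max(env[end:])           # loudest frame after the release
    return {
        "depth_pct": 100.0 * (1.0 - floor / vowel),
        "depth_db": 20.0 * math.log10(vowel / floor),
        "burst_rise_db": 20.0 * math.log10(release / floor),
    }


def within_db(a: Dict[str, float], b: Dict[str, float], tol_db: float = 1.0) -> bool:
    """True when both dB metrics agree within tol_db."""
    return all(abs(a[k] - b[k]) <= tol_db for k in ("depth_db", "burst_rise_db"))


# Made-up envelopes: a reference stop, a close match, and an approximant-like dip.
reference = closure_metrics([1.0, 1.0, 0.1, 0.1, 0.8], 2, 4)
candidate = closure_metrics([1.0, 1.0, 0.105, 0.105, 0.8], 2, 4)
approximant = closure_metrics([1.0, 1.0, 0.5, 0.5, 0.8], 2, 4)

print(within_db(reference, candidate))
print(within_db(reference, approximant))
```

The shallow dip of the approximant-like envelope is what separates it from the reference here, which is the kind of separation the verification above relies on.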
Words confirmed sounding right:
- intervocalic (/ɡ_es/, 8ms closure): entregado, lago, hago, haga, diga,
  amigo, pagar, agua, luego
- cluster (plain /ɡ/, 30ms closure): algo, largo, rasgar, regla
Word-initial /ɣ/ (e.g. "gusano") was already routed to plain /ɡ/ before
this beta — unchanged.
What this beta does NOT fix
- "siguiente" still leans toward "sillente": that's a /ʝ/ tuning
  issue, tracked under #20. Independent of /ɣ/.
- "entrelado" /l/-vs-/d/: the b101 lateral antiresonance helped
  significantly but not 100%. Independent of /ɣ/. Will revisit.
Testing
- 49 C++ unit tests (doctest) all pass, including the 18 pre-existing
  /ɣ/-trace tests, updated to find /ɡ_es/ post-replacement.
- New Hypothesis check test asserts the new /ɡ_es/ matches the
  hardG_gap8 reference acoustically.
- New Audit test renders 13 Spanish words covering intervocalic +
  cluster + special-case contexts.
- papa/tapa renders byte-identical to pre-fix (no override on /p,t,k/).
- Python pytest suite still passes.
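A byte-identical check like the papa/tapa one can be sketched as a golden-hash comparison. This is a hypothetical Python illustration, not the project's actual test: the function names are invented, and short byte strings stand in for rendered audio.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Hex digest of a render's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def changed_words(golden: Dict[str, str], renders: Dict[str, bytes]) -> List[str]:
    """Return the words whose current render no longer matches its golden hash."""
    return sorted(
        word for word, audio in renders.items()
        if golden.get(word) != sha256_hex(audio)
    )


# Placeholder bytes standing in for pre-fix and post-fix renders.
pre_fix = {"papa": b"papa-pcm-bytes", "tapa": b"tapa-pcm-bytes"}
golden = {word: sha256_hex(audio) for word, audio in pre_fix.items()}

post_fix = dict(pre_fix)  # unchanged renders, as expected for /p,t,k/
print(changed_words(golden, post_fix))
```

Storing hashes rather than audio keeps the fixture small; an empty list means no stop outside /ɡ_es/ picked up the new override by accident.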
Please ear-test
Native Spanish speakers, especially @gregodejesus2, @29-Bloo,
@rmcpantoja, @yaresDg, @dgomez42 — please test on Windows (NVDA or
SAPI) or Android (APK).
Comparison test words:
- entregado, lago, hago, amigo, agua (pure intervocalic: should sound
  natural, with audible velar character but no word-break)
- algo, largo, rasgar (cluster: should have a clearer hard-G feel)
- regla, siguiente (control: unchanged behavior, /ɣ/ not in effect)
If something sounds wrong
Post on #84 or #95 with the affected word(s). We'll tune and re-release.
— Tamas + Claudeo (Opus 4.7)