tgeczy/TGSpeechBox v-275 on GitHub

TGSpeechBox, Rename & Migration Release!

This fork of NV Speech Player is now TGSpeechBox.

After growing from 4,800 lines to more than 42,000 lines across 148 files, with a fully rewritten DSP engine, 26 language packs, Fujisaki pitch modeling, coarticulation, voice profiles, and FrameEx voice quality parameters, this project has long outgrown its origins. The rename reflects that reality and is a first step toward potentially resolving the GPL v2 licensing constraints on app store distribution.

What changed

The project has been renamed from NV Speech Player to TGSpeechBox across the entire codebase. This is a naming change only — all speech synthesis behavior, voice quality, and settings are identical.

For existing users

Your settings are automatically preserved. On first launch after updating the NVDA add-on, TGSpeechBox detects your saved NV Speech Player configuration (voice, pitch, rate, language, and all 50+ custom parameters) and migrates it to the new config section. A one-time dialog confirms the migration succeeded.

If you still have the original NV Access NV Speech Player add-on installed alongside TGSpeechBox, both can coexist — they use separate config sections and separate synth driver folders. You can remove the old add-on from NVDA menu → Tools → Manage add-ons when you're ready.

Linux renderer gets symlink treatment

On Linux, the names of the library and renderer have changed: tgsBRender for the TGSpeechBox renderer, and just tgsp for the wrapper. The install.sh script takes care of remapping these for you, so relaunching Speech Dispatcher should not break anything. It is still a good idea to take the newly included .conf file and place it where your Speech Dispatcher .conf files reside, so the old paths can be deleted later. The installer will keep recreating the symlinks for the next few releases to allow some time to adjust. A symlink costs virtually zero disk space. Peace of mind that old confs don't break? That's priceless.
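As a rough illustration of the symlink approach, the sketch below creates a compatibility link from an old name to a new one. The real work is done by install.sh, which is not reproduced here. The function name, the directory layout, and the pairing of nvspRender with tgsBRender are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def ensure_compat_symlink(install_dir, new_name, old_name):
    """Point old_name at new_name inside install_dir (hypothetical helper)."""
    install_dir = Path(install_dir)
    link = install_dir / old_name
    if link.is_symlink():
        # Refresh a stale link left behind by an earlier install.
        link.unlink()
    elif link.exists():
        # A real file still lives under the old name; leave it untouched.
        return link
    # A relative target keeps the link valid if the directory is moved.
    link.symlink_to(new_name)
    return link


def demo():
    """Create a fake install directory and return the name the link resolves to."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "tgsBRender").write_text("placeholder binary")
        link = ensure_compat_symlink(tmp, "tgsBRender", "nvspRender")
        return link.is_symlink(), link.resolve().name
```

Calling the helper twice is harmless: the second call simply replaces the existing link, which mirrors the idea of the installer recreating symlinks on each release.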

Renamed across the codebase

NVDA synth driver: internal name tgSpeechBox, display name "TGSpeechBox"
NVDA add-on manifest: name and summary updated to TGSpeechBox
Settings panel: appears as "TGSpeechBox Language Packs" in NVDA Settings
C/C++ source headers: updated with TGSpeechBox attribution, preserving NV Access original credit
Include guards: SPEECHPLAYER_*_H → TGSPEECHBOX_*_H throughout
Phoneme editor: window titles and UI strings updated to "TGSB Phoneme Editor"
Linux renderer: header updated to "TGSBRender (formerly nvspRender)"
Log prefixes: all NVDA driver log messages now prefixed "TGSpeechBox:"
Documentation: Developers.md, Tuning.md, README files all updated
GitHub URLs: references point to tgeczy/TGSpeechBox

Preserved for compatibility

DLL filenames: speechPlayer.dll and nvspFrontend.dll are unchanged — existing language packs, voice profiles, and phoneme data continue to work without modification
C++ namespace names: nvsp_frontend, nvsp_editor retained for API stability
Build targets: CMake and Makefile output names unchanged

Config migration details

The migration module (migrate_config.py) runs at synth initialization before NVDA loads settings. It uses NVDA's isSet() API to check for real saved data (not config spec defaults), copies all keys from the base profile's [[nvSpeechPlayer]] section to [[tgSpeechBox]], invalidates the config cache, and persists immediately. The old config section is left intact so the original NV Access add-on continues to work if installed.
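The flow described above can be sketched with plain dictionaries. This is not the code in migrate_config.py and it does not touch NVDA's config objects: the `is_set` callable stands in for NVDA's isSet() check, the function name is hypothetical, and skipping the copy when the new section already holds data is an assumption about how the one-time behavior is achieved.

```python
import copy

OLD_SECTION = "nvSpeechPlayer"
NEW_SECTION = "tgSpeechBox"


def migrate_config(base_profile, is_set):
    """Copy the old section to the new one if the user really saved settings.

    base_profile: dict mapping section name -> dict of settings (stand-in
                  for the base profile's speech config).
    is_set:       callable(key) -> bool, True only for values the user
                  saved, not for spec defaults (stand-in for isSet()).
    Returns True when a migration was performed.
    """
    old = base_profile.get(OLD_SECTION, {})
    # Only migrate when at least one value is real saved data.
    if not any(is_set(key) for key in old):
        return False
    # Assumption: an already populated new section means we migrated before.
    if base_profile.get(NEW_SECTION):
        return False
    # Copy every key; the old section is deliberately left intact.
    base_profile[NEW_SECTION] = copy.deepcopy(old)
    return True
```

In the real add-on the copy is followed by invalidating the config cache and saving immediately; those steps depend on NVDA internals and are left out of this sketch.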

Compatible with NVDA 2023.2 through 2026.1 beta.

Download URL

If you have a prior version configured to update from the old download URL, a 301 redirect is in place. No manual URL changes are needed.

Resonators rewritten: biquad → trapezoidal SVF

The cascade and parallel formant resonators have been completely replaced. The classic Klatt biquad (Direct Form II) is gone, replaced by an implementation based on Andrew Simper's trapezoidal-integrated state-variable filter method (Cytomic).

Why this matters

Zipper-free formant sweeps. Frequency and Q are now independent parameters. Continuously varying formants during coarticulation and diphthongs no longer risk coefficient discontinuities or clicks.
Better stability at low sample rates. The old biquad lost coefficient precision near Nyquist, contributing to "old cell phone" artifacts at 11025 Hz. The SVF spreads precision more evenly across the spectrum.
Unconditional stability. No NaN explosions from coefficient edge cases because the trapezoidal integrator is stable for all parameter values.
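For readers unfamiliar with the technique, here is a minimal Python sketch of a trapezoidal state-variable filter in the style of Andrew Simper's published formulation, using its lowpass output as a resonator. It is not the TGSpeechBox C++ implementation. The class name, the default parameters, and the mapping from formant bandwidth to damping (k = bandwidth / frequency, that is Q = frequency / bandwidth) are assumptions for illustration.

```python
import math


class TrapezoidalSVF:
    """Two-pole state-variable filter using trapezoidal integration."""

    def __init__(self, sample_rate):
        self.sample_rate = float(sample_rate)
        self.ic1eq = 0.0  # state of the first integrator
        self.ic2eq = 0.0  # state of the second integrator
        self.set_params(500.0, 60.0)

    def set_params(self, freq_hz, bandwidth_hz):
        # freq_hz must stay below Nyquist so that tan() remains positive.
        g = math.tan(math.pi * freq_hz / self.sample_rate)
        k = bandwidth_hz / freq_hz  # damping = 1/Q (assumed mapping)
        self.a1 = 1.0 / (1.0 + g * (g + k))
        self.a2 = g * self.a1
        self.a3 = g * self.a2

    def process(self, x):
        v3 = x - self.ic2eq
        v1 = self.a1 * self.ic1eq + self.a2 * v3               # bandpass
        v2 = self.ic2eq + self.a2 * self.ic1eq + self.a3 * v3  # lowpass
        self.ic1eq = 2.0 * v1 - self.ic1eq
        self.ic2eq = 2.0 * v2 - self.ic2eq
        return v2  # lowpass output has unity gain at DC
```

Because frequency and damping enter through g and k separately, set_params can be called every sample during a formant sweep while the two integrator states carry over unchanged.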
The anti-resonator (nasal zero) uses a dedicated FIR all-zero filter rather than the SVF notch output. The SVF notch places zeros on the unit circle (infinitely deep null), which is too aggressive for speech nasalization. The FIR places zeros inside the unit circle at a depth controlled by bandwidth, matching the behavior expected by existing phoneme data.
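To make the "zeros inside the unit circle" point concrete, the sketch below uses the classic Klatt-style anti-resonator, a three-tap FIR whose coefficients are the inverse of the matching resonator. Whether TGSpeechBox uses exactly these coefficients is an assumption, and the function names are made up for this example.

```python
import math


def antiresonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Return (a, b, c, zero_radius) for y[n] = a*x[n] + b*x[n-1] + c*x[n-2]."""
    # Zero radius: wider bandwidth pulls the zeros further inside the
    # unit circle, which makes the spectral dip shallower.
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    res_c = -(r * r)
    res_b = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    res_a = 1.0 - res_b - res_c  # resonator gain term (unity gain at DC)
    # Invert the resonator to obtain the all-zero filter.
    return 1.0 / res_a, -res_b / res_a, -res_c / res_a, r


def fir_antiresonator(samples, freq_hz, bandwidth_hz, sample_rate):
    """Filter a sequence of samples through the all-zero anti-resonator."""
    a, b, c, _ = antiresonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    x1 = x2 = 0.0  # previous two inputs
    out = []
    for x in samples:
        out.append(a * x + b * x1 + c * x2)
        x2, x1 = x1, x
    return out
```

With a 100 Hz bandwidth at 22050 Hz the zero radius is about 0.986, so the null is finite. An SVF notch would correspond to a radius of exactly 1.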
Frame crossfades also moved from linear interpolation to cosine smoothing, and formant frequencies now interpolate in the log domain. A sweep from 300 Hz to 2400 Hz therefore passes through ~849 Hz at the midpoint (the geometric mean) rather than 1350 Hz (the arithmetic mean), which produces more natural diphthong and CV transitions.
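The midpoint arithmetic can be checked with a short sketch. The function names are hypothetical, and feeding the cosine-smoothed position into the log-domain interpolation is an assumption about how the two changes combine in the engine.

```python
import math


def cosine_smooth(t):
    """Map linear progress t in [0, 1] to a raised-cosine ease-in/ease-out."""
    return 0.5 - 0.5 * math.cos(math.pi * t)


def interpolate_formant(f_start_hz, f_end_hz, t):
    """Interpolate between two frequencies in the log domain."""
    s = cosine_smooth(t)
    log_f = (1.0 - s) * math.log(f_start_hz) + s * math.log(f_end_hz)
    return math.exp(log_f)


def interpolate_linear(f_start_hz, f_end_hz, t):
    """Plain linear interpolation, for comparison with the old behavior."""
    return (1.0 - t) * f_start_hz + t * f_end_hz
```

At t = 0.5 the log-domain version returns sqrt(300 × 2400) ≈ 848.5 Hz, while the linear version returns 1350 Hz.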

Attribution

TGSpeechBox is originally based on NV Speech Player by NV Access Limited (2014). Extended 2025–2026 by Tamas Geczy. Licensed under GNU GPL v2.0.

The original NV Access version (UK English only, classic sawtooth DSP) remains available at:
https://github.com/nvaccess/nvSpeechPlayer

tgeczy/TGSpeechBox v-275 TG SpeechBox with phoneme editor and NVDA Addon, Speech Dispatcher module version 2.75 on GitHub