github tgeczy/TGSpeechBox v-250
NVSpeech Player with phoneme editor and NVDA Addon, Speech Dispatcher module version 2.5

one month ago

NV Speech Player 2.5 Release Notes

This is a major feature release introducing DSP Version 5, bringing significant improvements to voice quality control, new per-frame voice parameters, and cross-platform tooling updates.


Highlights

  • 12 VoicingTone parameters (up from 10) for global voice character shaping
  • 5 new FrameEx parameters for per-frame voice quality (creaky voice, breathiness, jitter, shimmer, glottal sharpness)
  • 10 real-time sliders in NVDA settings for voice tuning
  • Phoneme Editor with full DSP V5 UI support
  • Linux renderer updated with all new parameters
  • Backward compatible: older drivers work with new DLLs and vice versa, although such mixed pairings will not apply the new frame parameters or the new voicing tone parameters.

DSP Engine (speechPlayer.dll)

VoicingTone V3 — Two New Parameters

The VoicingTone struct now includes 12 parameters (up from 10 in V2):

| Parameter | Range | Description |
| --- | --- | --- |
| speedQuotient | 0.5–4.0 | Glottal pulse asymmetry. Lower values create softer, more relaxed voices; higher values create sharper, more pressed voices. Default 2.0. |
| aspirationTiltDbPerOct | -12 to +12 dB/oct | Spectral tilt for aspiration/breath noise. Independent of voicedTiltDbPerOct. Negative = muffled, darker breath; positive = crisper consonants. |

FrameEx — Per-Frame Voice Quality

The new speechPlayer_queueFrameEx() API enables voice quality variations within speech, which is essential for features like Danish stød (creaky voice) and for natural voice modeling:

| Parameter | Range | Description |
| --- | --- | --- |
| creakiness | 0.0–1.0 | Laryngealization / creaky voice. Adds pitch irregularity and tighter glottal closure. |
| breathiness | 0.0–1.0 | Additional voiced breathiness independent of the frame's voiceTurbulenceAmplitude. |
| jitter | 0.0–1.0 | Pitch perturbation (cycle-to-cycle F0 variation). |
| shimmer | 0.0–1.0 | Amplitude perturbation (cycle-to-cycle intensity variation). |
| sharpness | 0.5–2.0 | Glottal closure sharpness multiplier. Higher values create crisper, more "Eloquence-like" attacks. |

The FrameEx struct is optional — pass NULL to queueFrameEx() for default behavior, or continue using queueFrame() for full backward compatibility.
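
As a rough illustration of the ranges in the table above, the following Python sketch clamps a set of FrameEx values to their documented ranges before they would be handed to the engine. The `FrameExValues` class and `clamp_frame_ex` helper are hypothetical names, not part of the add-on, and the neutral default of 1.0 for sharpness is an assumption based on it being described as a multiplier.

```python
from dataclasses import dataclass, fields

# Documented (min, max) ranges taken from the FrameEx table in these notes.
FRAME_EX_RANGES = {
    "creakiness": (0.0, 1.0),
    "breathiness": (0.0, 1.0),
    "jitter": (0.0, 1.0),
    "shimmer": (0.0, 1.0),
    "sharpness": (0.5, 2.0),
}


@dataclass
class FrameExValues:
    # Hypothetical container; defaults are assumed "off"/neutral values.
    creakiness: float = 0.0
    breathiness: float = 0.0
    jitter: float = 0.0
    shimmer: float = 0.0
    sharpness: float = 1.0  # assumption: 1.0 acts as a neutral multiplier


def clamp_frame_ex(values: FrameExValues) -> FrameExValues:
    """Return a copy with every field clamped to its documented range."""
    clamped = {}
    for f in fields(values):
        lo, hi = FRAME_EX_RANGES[f.name]
        clamped[f.name] = min(max(getattr(values, f.name), lo), hi)
    return FrameExValues(**clamped)
```

Clamping on the caller's side keeps out-of-range slider or script input from reaching the DSP, whatever the engine itself does with such values.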

ABI Stability

  • Original 47-parameter Frame struct unchanged
  • VoicingTone uses magic number + version header for safe detection
  • FrameEx includes frameExSize parameter for forward compatibility
  • Older drivers work seamlessly with new DLLs. The new driver also still works with the SpeechPlayer DLL provided by NV Access in 2014.
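
The size-based forward compatibility mentioned above is a common ABI pattern: the caller states how many bytes of the struct it provides, and the receiver reads only the fields both sides know about, leaving the rest at defaults. The Python sketch below illustrates the idea with the `struct` module. The field order, the defaults, and the `read_frame_ex` helper are assumptions for illustration and do not reproduce the DLL's actual code.

```python
import struct

# Assumed field order (the order used in the FrameEx table) and neutral defaults.
KNOWN_FIELDS = ["creakiness", "breathiness", "jitter", "shimmer", "sharpness"]
DEFAULTS = {"creakiness": 0.0, "breathiness": 0.0, "jitter": 0.0,
            "shimmer": 0.0, "sharpness": 1.0}
DOUBLE_SIZE = struct.calcsize("<d")  # 8 bytes


def read_frame_ex(buf: bytes, frame_ex_size: int) -> dict:
    """Decode as many known doubles as the caller supplied; default the rest."""
    # Never read past what the caller declared, what the buffer holds,
    # or what this (hypothetical) receiver version understands.
    usable = min(frame_ex_size, len(buf), len(KNOWN_FIELDS) * DOUBLE_SIZE)
    count = usable // DOUBLE_SIZE
    values = struct.unpack_from(f"<{count}d", buf, 0)
    result = dict(DEFAULTS)
    result.update(zip(KNOWN_FIELDS, values))
    return result
```

A caller built against a shorter struct gets defaults for the fields it does not know, and a caller built against a longer one simply has its extra trailing bytes ignored.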

NVDA Add-on

10 Voice Tuning Sliders

The driver now exposes 10 sliders in NVDA's voice settings (up from 4):

VoicingTone Sliders (global voice character):

  • Voice tilt (brightness)
  • Noise glottal modulation
  • Pitch-sync F1 delta
  • Pitch-sync B1 delta
  • Speed quotient (new)

FrameEx Sliders (per-frame voice quality):

  • Creakiness (new)
  • Breathiness (new)
  • Jitter (new)
  • Shimmer (new)
  • Glottal sharpness (new)

All slider values are stored per-voice, so different voice profiles can have different settings.

Bug Fixes

  • Fixed DLL unloading for nvspFrontend.dll — both DLLs now properly release file locks on termination, allowing add-on updates without restarting NVDA.

Phoneme Editor (Windows)

New Voice Quality Dialog Section

The Speech Settings dialog now includes a "Voice Quality (FrameEx)" section with:

  • 5 parameter sliders: Creakiness, Breathiness, Jitter, Shimmer, Sharpness
  • Reset buttons: Per-parameter and reset-all
  • Live preview: Hear changes immediately when speaking phonemes

Full DSP V5 Support

  • All 12 VoicingTone parameters accessible
  • All 5 FrameEx parameters accessible
  • Both speaking paths (Synth IPA and single phoneme preview) use queueFrameEx
  • Proper default handling (sharpness defaults to 50/neutral, others to 0/off)

Linux Renderer (nvspRender)

New Command-Line Parameters

The renderer now supports all DSP V5 features via command line:

Basic options:

--voice <name>        Voice profile from phonemes.yaml
--samplerate <hz>     Output sample rate (fixed: was ignoring this before!)

VoicingTone parameters (0-100 sliders):

--voicing-peak-pos    --voiced-preemph-a    --voiced-preemph-mix
--high-shelf-gain     --high-shelf-fc       --high-shelf-q
--voiced-tilt         --noise-glottal-mod   --pitch-sync-f1
--pitch-sync-b1       --speed-quotient      --aspiration-tilt

FrameEx parameters (0-100 sliders):

--creakiness          --breathiness         --jitter
--shimmer             --sharpness
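
As a sketch of how these flags might be assembled from a script, the Python helper below builds an argument list from slider values and rejects anything outside 0-100. Only the flag names come from the lists above. The `build_args` helper is hypothetical, the `--flag value` spacing for slider flags is an assumption modeled on `--voice <name>`, the sample rate in the usage line is arbitrary, and how input text reaches nvspRender is not covered here.

```python
import shlex

# Slider flag names copied from the VoicingTone and FrameEx lists above.
SLIDER_FLAGS = {
    "voicing-peak-pos", "voiced-preemph-a", "voiced-preemph-mix",
    "high-shelf-gain", "high-shelf-fc", "high-shelf-q",
    "voiced-tilt", "noise-glottal-mod", "pitch-sync-f1",
    "pitch-sync-b1", "speed-quotient", "aspiration-tilt",
    "creakiness", "breathiness", "jitter", "shimmer", "sharpness",
}


def build_args(voice, samplerate, sliders):
    """Build an nvspRender argument list, validating slider names and values."""
    args = ["nvspRender", "--voice", voice, "--samplerate", str(samplerate)]
    for name, value in sorted(sliders.items()):
        if name not in SLIDER_FLAGS:
            raise ValueError(f"unknown slider flag: --{name}")
        if not 0 <= value <= 100:
            raise ValueError(f"--{name} must be in 0-100, got {value}")
        args += [f"--{name}", str(value)]
    return args


if __name__ == "__main__":
    # Print a shell-quoted command line for inspection.
    print(shlex.join(build_args("Adam", 22050, {"creakiness": 30, "sharpness": 50})))
```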

Bug Fix

  • Fixed sample rate handling — --samplerate was being ignored (hardcoded to 16000). Now properly respected.

Speech Dispatcher Integration

Updated nvsp-generic.conf:

  • Voice profile support via $VOICE variable
  • Corrected sample rate in comments
  • Example entries for custom voice profiles (Adam, Benjamin, Caleb)

Voice Profiles

Voice profiles defined in phonemes.yaml now work across all platforms:

voiceProfiles:
  Adam:
    cf1_mul: 1.0
    cf2_mul: 0.95
    voicingTone:
      speedQuotient: 2.2
      voicedTiltDbPerOct: -3.0

Profiles can include:

  • Formant scaling/overrides
  • VoicingTone parameters
  • Any frame parameter multipliers
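
To make the profile format concrete, here is a minimal Python sketch of how a `_mul` entry could be applied to a frame. It assumes that a key such as `cf1_mul` multiplies the frame field `cf1`. The `apply_profile` helper and the sample frame values used with it are hypothetical, and the real engine may resolve profiles differently.

```python
# Profile mirroring the YAML example above, written as a plain dict
# so the sketch needs no YAML library.
ADAM = {
    "cf1_mul": 1.0,
    "cf2_mul": 0.95,
    "voicingTone": {"speedQuotient": 2.2, "voicedTiltDbPerOct": -3.0},
}


def apply_profile(frame, profile):
    """Return (scaled_frame, voicing_tone_overrides) without mutating inputs."""
    scaled = dict(frame)
    for key, factor in profile.items():
        if key.endswith("_mul"):
            field = key[: -len("_mul")]
            # Assumption: "<field>_mul" scales the frame field of the same name.
            if field in scaled:
                scaled[field] = scaled[field] * factor
    return scaled, dict(profile.get("voicingTone", {}))
```

For example, `apply_profile({"cf1": 500.0, "cf2": 1500.0}, ADAM)` leaves `cf1` unchanged and lowers `cf2` by 5 percent, while the `voicingTone` entries are returned separately because they apply globally rather than per frame.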

Technical Details

Version Numbers

| Component | Version |
| --- | --- |
| DSP Version | 5 |
| VoicingTone Struct Version | 3 |
| VoicingTone Magic | 0x32544F56 ("VOT2") |
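
The magic number is the ASCII tag stored as a little-endian 32-bit integer, which can be checked in a few lines of Python. The `has_voicing_tone_header` helper, and the assumption that the magic and version are the first two uint32 fields, are illustrative only; these notes do not spell out the header layout.

```python
import struct

VOICING_TONE_MAGIC = 0x32544F56  # value from the table above
VOICING_TONE_VERSION = 3


def magic_as_text(magic: int) -> str:
    """Render a 32-bit magic number as the ASCII tag it encodes (little-endian)."""
    return struct.pack("<I", magic).decode("ascii")


def has_voicing_tone_header(buf: bytes, min_version: int = VOICING_TONE_VERSION) -> bool:
    # Assumption: magic and version are the first two little-endian uint32 fields.
    if len(buf) < 8:
        return False
    magic, version = struct.unpack_from("<II", buf, 0)
    return magic == VOICING_TONE_MAGIC and version >= min_version
```

A receiver that finds no valid magic can treat the buffer as a legacy struct instead of misreading it, which is the point of putting a tagged header in front of the data.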

Struct Sizes

| Struct | Fields | Size |
| --- | --- | --- |
| Frame | 47 doubles | 376 bytes |
| FrameEx | 5 doubles | 40 bytes |
| VoicingTone | 4 uint32 + 12 doubles | 112 bytes |
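
The sizes in the table follow directly from the field counts, which can be confirmed with `ctypes`. Most field names below are placeholders, and the ordering is assumed (the notes give only counts and types), so treat this as a size check rather than a usable binding.

```python
import ctypes


class Frame(ctypes.Structure):
    # 47 doubles; real field names omitted, placeholders used instead.
    _fields_ = [(f"p{i}", ctypes.c_double) for i in range(47)]


class FrameEx(ctypes.Structure):
    # 5 doubles; order assumed to match the FrameEx table.
    _fields_ = [(name, ctypes.c_double) for name in
                ("creakiness", "breathiness", "jitter", "shimmer", "sharpness")]


class VoicingTone(ctypes.Structure):
    # 4 uint32 header words followed by 12 doubles (placeholder names).
    _fields_ = ([(f"h{i}", ctypes.c_uint32) for i in range(4)]
                + [(f"v{i}", ctypes.c_double) for i in range(12)])


if __name__ == "__main__":
    for struct_type in (Frame, FrameEx, VoicingTone):
        print(struct_type.__name__, ctypes.sizeof(struct_type))
```

The four uint32 header words occupy 16 bytes, so the doubles that follow are already 8-byte aligned and no padding is added: 16 + 12 × 8 = 112.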

Upgrade Notes

For NVDA Users

  1. Install the new add-on over the existing one
  2. Restart NVDA
  3. New sliders appear automatically in voice settings
  4. Existing voice settings are preserved

For Developers

  • queueFrame() continues to work unchanged
  • queueFrameEx() is additive — use it only when you need FrameEx features
  • Check hasFrameExSupport() before using FrameEx API
  • VoicingTone V3 is backward compatible with V2 DLLs (extra fields ignored)
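
One way to follow the FrameEx advice from Python is to probe the loaded library for the new export and fall back to the classic call when it is missing. How `hasFrameExSupport()` is implemented is not shown in these notes. The sketch below assumes the library object raises `AttributeError` for missing symbols, as a `ctypes.CDLL` instance does, and that the classic export is named `speechPlayer_queueFrame` by analogy with `speechPlayer_queueFrameEx`. `pick_queue_function` is a hypothetical helper, and no argument lists are shown because they are not documented here.

```python
def pick_queue_function(lib):
    """Return (function, uses_frame_ex) for the best queueing call lib exports."""
    # getattr with a default swallows the AttributeError raised for a missing export.
    ex = getattr(lib, "speechPlayer_queueFrameEx", None)
    if ex is not None:
        return ex, True
    # Older DLLs: only the classic call is available.
    return getattr(lib, "speechPlayer_queueFrame"), False
```

Resolving the function once at load time keeps the per-frame path free of repeated capability checks.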

For Linux Users

  1. Rebuild nvspRender with the new source, or use the libs in the prebuilt .gz archive.
  2. Update nvsp-generic.conf if using Speech Dispatcher
  3. New parameters are opt-in via command line flags

Contributors

  • Tamas Geczy: maintainer
  • Claude (Anthropic): development assistance. Research into the big questions behind the glottal changes was assisted by OpenAI's GPT.
  • Shoutout goes to Cleverson for his continued dedication to Portuguese support.
