NV Speech Player Release Notes 2.0
We're calling this 2.0 because another voice change has occurred. People may or may not like it, but they sure can tweak it!
Highlights
- Hybrid glottal source - Dramatically improved audio quality at 11025 Hz sample rate
- VoicingTone v2 API - 3 new DSP parameters with backward-compatible versioning
- Phoneme Editor voicing controls - Full UI for all 10 voicing parameters with per-voice storage
- NVDA addon voicing sliders - New sliders in synth settings for voice quality adjustment
DSP Engine (speechPlayer.dll / libspeechPlayer.so)
Hybrid Glottal Source
The synthesizer now uses a sample-rate-adaptive glottal waveform, choosing the pulse shape from the output sample rate:
| Sample Rate | Waveform | Character |
|---|---|---|
| 11025 Hz | Symmetric cosine | Fuller sound, no aliasing artifacts |
| 16000+ Hz | LF-inspired asymmetric | Rich harmonics, Eloquence-like buzz |
Why this matters: The previous LF-inspired pulse generated strong high-frequency harmonics that sounded great at 16 kHz but caused harsh aliasing artifacts at 11025 Hz (common for accessibility/low-bandwidth use). The hybrid approach keeps the harmonically rich LF-style pulse at 16 kHz and above, and switches to the alias-free cosine pulse at 11025 Hz.
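A rough sketch of the selection logic, in Python for readability. The pulse formulas, the 16000 Hz cutoff for rates between the two table rows, and the `OPEN_END`/`PEAK` constants are all illustrative assumptions, not the engine's actual DSP code:

```python
import math

LF_MIN_RATE = 16000  # assumption: rates below this use the cosine pulse
OPEN_END = 0.6       # hypothetical: fraction of the cycle the glottis is open
PEAK = 0.45          # hypothetical: where the asymmetric pulse peaks


def glottal_pulse(phase, sample_rate):
    """Return one sample of a glottal pulse for phase in [0, 1)."""
    phase = phase % 1.0
    if sample_rate < LF_MIN_RATE:
        # Symmetric raised cosine: smooth, few high harmonics to alias.
        return 0.5 * (1.0 - math.cos(2.0 * math.pi * phase))
    # Asymmetric pulse: slow rise, fast fall, then a closed phase.
    if phase < PEAK:
        return 0.5 * (1.0 - math.cos(math.pi * phase / PEAK))
    if phase < OPEN_END:
        return math.cos(0.5 * math.pi * (phase - PEAK) / (OPEN_END - PEAK))
    return 0.0
```

The sharp closing edge of the asymmetric branch is what produces the extra high-frequency energy, which is why it is only worth using when the sample rate leaves room for it.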
VoicingTone v2 Struct
The VoicingTone API now supports 10 parameters (up from 7) with a version detection header for backward compatibility.
New Parameters
| Parameter | Description | Range | Default |
|---|---|---|---|
| noiseGlottalModDepth | Klatt-style amplitude modulation of noise sources (aspiration/frication) synced to glottal cycle | 0.0–1.0 | 0.0 |
| pitchSyncF1DeltaHz | F1 frequency shift during glottal open phase | -60 to +60 Hz | 0.0 |
| pitchSyncB1DeltaHz | B1 bandwidth widening during glottal open phase | -50 to +50 Hz | 0.0 |
The pitch-sync parameters model the acoustic coupling between glottal source and vocal tract, enabling Eloquence-like "buzzy clarity" effects.
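To make the three parameters a bit more concrete, here is a minimal sketch of how they could act on a single sample. It assumes a square-wave noise modulation over the second half of the cycle and an open phase at the start of the cycle; the `open_quotient` argument and both function names are hypothetical, and the real DSP may shape these differently:

```python
def noise_gain(phase, noise_glottal_mod_depth):
    """Gain applied to aspiration/frication noise at a glottal phase in [0, 1).

    Assumption: noise is attenuated during the second half of the cycle,
    by an amount set by the depth (0.0 = no modulation, 1.0 = fully gated).
    """
    return 1.0 - noise_glottal_mod_depth if phase >= 0.5 else 1.0


def pitch_sync_formant(f1_hz, b1_hz, phase, f1_delta_hz, b1_delta_hz,
                       open_quotient=0.5):
    """Return (F1, B1) for this sample, shifted while the glottis is open.

    Assumption: the open phase is the first `open_quotient` of the cycle.
    """
    if phase < open_quotient:
        return f1_hz + f1_delta_hz, b1_hz + b1_delta_hz
    return f1_hz, b1_hz
```

With all three parameters at their default of 0.0, both functions leave the signal untouched, which matches the defaults listed in the table.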
Version Detection Header
The v2 struct includes a header for safe negotiation between callers and DLL:

    typedef struct {
        uint32_t magic;         // 0x32544F56 ("VOT2")
        uint32_t structSize;    // sizeof(speechPlayer_voicingTone_t)
        uint32_t structVersion; // 2
        uint32_t dspVersion;    // 4
        // ... 10 double parameters ...
    } speechPlayer_voicingTone_t;

Backward compatibility:
- Old callers passing v1 struct (7 doubles, no header) → DLL detects missing magic, uses legacy layout
- New callers with old DLL → DLL ignores unknown fields, uses what it understands
- New speechPlayer_getDspVersion() export for runtime feature detection
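For callers working from Python, the header can be mirrored with ctypes along these lines. This is a sketch under assumptions: the ten doubles are lumped into one hypothetical `params` array because only three of the field names appear in these notes, and `detect_layout` imitates the kind of magic check the DLL is described as doing rather than reproducing its code:

```python
import ctypes
import struct

VOT2_MAGIC = 0x32544F56  # reads "VOT2" when stored little-endian
V2_PARAM_COUNT = 10


class VoicingToneV2(ctypes.Structure):
    _fields_ = [
        ("magic", ctypes.c_uint32),
        ("structSize", ctypes.c_uint32),
        ("structVersion", ctypes.c_uint32),
        ("dspVersion", ctypes.c_uint32),
        # Hypothetical: the real struct names each of the 10 doubles.
        ("params", ctypes.c_double * V2_PARAM_COUNT),
    ]


def make_v2(params):
    """Build a v2 struct with the header filled in."""
    tone = VoicingToneV2()
    tone.magic = VOT2_MAGIC
    tone.structSize = ctypes.sizeof(VoicingToneV2)
    tone.structVersion = 2
    tone.dspVersion = 4
    tone.params = (ctypes.c_double * V2_PARAM_COUNT)(*params)
    return tone


def detect_layout(buf):
    """Return 2 if the buffer starts with the v2 magic, else 1 (legacy)."""
    if len(buf) >= 4 and struct.unpack_from("=I", buf)[0] == VOT2_MAGIC:
        return 2
    return 1
```

A v1 buffer is just seven raw doubles, so it only gets mistaken for v2 if the low four bytes of its first double happen to spell the magic.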
NVDA Addon
New Voicing Tone Sliders
The synthesizer settings ring now includes 4 voicing parameter sliders:
| Slider | Range | Default | Effect |
|---|---|---|---|
| Voiced Tilt | 0–100 | 50 | Spectral slope (±24 dB/oct) |
| Noise Glottal Mod | 0–100 | 0 | Noise pulsing with voicing |
| Pitch-Sync F1 Delta | 0–100 | 50 | F1 modulation depth |
| Pitch-Sync B1 Delta | 0–100 | 50 | B1 modulation depth |
Settings are saved per-voice in the NVDA configuration.
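The slider defaults line up with a plain linear mapping from 0–100 onto each parameter's range (50 lands on 0 for the symmetric ranges, 0 lands on 0.0 for Noise Glottal Mod). A minimal sketch assuming that linear mapping; the function names are hypothetical and the addon may round or clamp differently:

```python
def slider_to_value(slider, lo, hi):
    """Map a 0-100 slider position linearly onto the range [lo, hi]."""
    return lo + (hi - lo) * (slider / 100.0)


def value_to_slider(value, lo, hi):
    """Inverse mapping, rounded to the nearest slider step."""
    return round((value - lo) / (hi - lo) * 100)
```

For example, `slider_to_value(50, -60.0, 60.0)` gives 0.0 Hz for Pitch-Sync F1 Delta, and `slider_to_value(100, -24.0, 24.0)` gives +24 dB/oct for Voiced Tilt.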
Driver Changes
- Updated speechPlayer.py with the VoicingTone v2 ctypes struct. The new NVDA driver will not sound good with VoicingTone v1 DLLs, but it will not crash either.
Phoneme Editor (nvspPhonemeEditorWin32)
Voicing Parameters UI
The Speech Settings dialog now includes a Voicing Parameters section with:
- Parameter listbox - All 10 voicing parameters
- Value slider - 0–100 range mapped to each parameter's actual range
- Reset button - Reset selected parameter to default
- Reset All button - Reset all voicing parameters to defaults
Per-Voice Storage
Voicing parameters are saved separately for each voice:
- Python preset voices (Adam, Benjamin, etc.) → Saved to INI file under [voice_Adam], [voice_Benjamin], etc.
- YAML profile voices → Stored in the profile's voicingTone: section
Switching voices in the dropdown automatically loads that voice's saved voicing parameters.
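The INI side of this can be pictured with Python's configparser. Only the `[voice_<name>]` section naming comes from these notes; the key names, the helper functions, and the choice to store values as floats are assumptions for illustration:

```python
import configparser


def new_config():
    """Create a parser that keeps key case (configparser lowercases by default)."""
    config = configparser.ConfigParser()
    config.optionxform = str
    return config


def save_voice_params(config, voice, params):
    """Store voicing parameters under the [voice_<name>] section."""
    section = f"voice_{voice}"
    if not config.has_section(section):
        config.add_section(section)
    for key, value in params.items():
        config.set(section, key, repr(float(value)))


def load_voice_params(config, voice, defaults):
    """Load a voice's parameters, falling back to defaults for missing keys."""
    section = f"voice_{voice}"
    return {key: config.getfloat(section, key, fallback=default)
            for key, default in defaults.items()}
```

Loading a voice that has never been saved simply returns the defaults, which is the behavior you would want when switching to a fresh voice in the dropdown.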
3-Tier DLL Fallback
The editor gracefully handles different DLL versions:
| DLL Exports | Detection | Behavior |
|---|---|---|
| setVoicingTone + getDspVersion | V2 | Send header + 10 params |
| setVoicingTone only | V1 | Send 7 params, no header |
| Neither | None | Skip VoicingTone, speech still works |
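The same three-way check is easy to express against a ctypes-loaded library, since looking up a missing export raises AttributeError and `hasattr` turns that into False. The editor itself is a Win32 program, so this Python version is only a sketch of the decision table; the full export name `speechPlayer_setVoicingTone` is assumed by analogy with `speechPlayer_getDspVersion`:

```python
from types import SimpleNamespace


def detect_voicing_tier(lib):
    """Classify a loaded library as 'v2', 'v1', or 'none' by its exports."""
    has_set = hasattr(lib, "speechPlayer_setVoicingTone")  # assumed export name
    has_version = hasattr(lib, "speechPlayer_getDspVersion")
    if has_set and has_version:
        return "v2"    # send header + 10 params
    if has_set:
        return "v1"    # send 7 params, no header
    return "none"      # skip VoicingTone entirely


# Stand-ins for loaded libraries, so the sketch runs without a real DLL.
new_dll = SimpleNamespace(speechPlayer_setVoicingTone=None,
                          speechPlayer_getDspVersion=None)
old_dll = SimpleNamespace(speechPlayer_setVoicingTone=None)
ancient_dll = SimpleNamespace()
```

With a real build, the same function could be passed the object returned by `ctypes.CDLL(...)` instead of the stand-ins.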
Bug Fixes
- Fixed crash when opening Voice Profiles dialog (missing function alias, struct field mismatches)
- Fixed type conversion errors in profile editor YAML parsing
Linux Support
New Build Artifacts
- libspeechPlayer.so - DSP engine shared library (x86_64)
- libnvspFrontend.so - IPA-to-frames frontend shared library (x86_64)
Built with GCC C++17, -O2 -fPIC.