What's New
Streaming transcription quality improvements
- Longer flush window: the default streaming flush interval increased from 3s to 5s, reducing word error rate by ~20% with minimal latency impact (#121)
- Cross-segment context carryover: the last 5 words of each transcribed segment are passed as decoder context to the next segment, reducing boundary errors at VAD-detected pauses. Active for WhisperKit via promptTokens; Parakeet and Qwen3 accept the parameter for forward compatibility (#119)
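
For readers curious how the context carryover behaves, here is a minimal, language-neutral sketch in Python. It is not the project's code: `tail_words`, `transcribe_stream`, and the `transcribe(audio, context)` callback are hypothetical names used only for illustration, and the real WhisperKit path passes tokenized context via promptTokens rather than a plain string. Keeping the previous context when a segment comes back empty is also an assumption, not something these notes specify.

```python
from typing import Callable, Iterable, List

CONTEXT_WORDS = 5  # the release notes specify the last 5 words


def tail_words(text: str, n: int = CONTEXT_WORDS) -> str:
    """Return the last n whitespace-separated words of text."""
    if n <= 0:
        return ""
    return " ".join(text.split()[-n:])


def transcribe_stream(
    segments: Iterable[str],
    transcribe: Callable[[str, str], str],
) -> List[str]:
    """Transcribe segments in order, carrying context across boundaries.

    `transcribe` is a hypothetical callback taking (audio, context) and
    returning the recognized text for that segment.
    """
    context = ""  # the first segment has no prior context
    results: List[str] = []
    for audio in segments:
        text = transcribe(audio, context)
        results.append(text)
        # Assumption: an empty result keeps the previous context
        # instead of clearing it.
        if text.strip():
            context = tail_words(text)
    return results
```

Each segment after the first is decoded with the tail of the previous transcript as context, which is the mechanism the carryover note describes for reducing errors at VAD-detected pauses.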
Contributors
Thanks to @Newarr for proposing both improvements with detailed benchmark data and academic references.
Full Changelog: v1.18.0...v1.18.1