What's Changed
- Improved streaming transcription accuracy: Increased the flush interval from 5s to 10s, giving the conformer encoder more context per segment. Benchmarked word error rate (WER) fell from 13.6% to 8.8% (Polish) and from 28.7% to 12.3% (English). VAD speechEnd events still trigger a flush on natural pauses, so short utterances appear without waiting for the full interval. (#174)
- Fixed minimum speech segment length: Raised minimumSpeechSamples to match Parakeet TDT's actual 1s minimum input requirement, eliminating unreliable output on very short segments. (#174)
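
The flush behaviour described in the two items above can be sketched roughly as follows. This is an illustration only, not the project's implementation: `SegmentBuffer`, `append`, and the 16 kHz sample rate are hypothetical, and dropping segments shorter than the minimum is an assumption, since the notes do not say how short segments are handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SAMPLE_RATE = 16_000                        # assumed sample rate in Hz (not stated in the notes)
FLUSH_INTERVAL_SAMPLES = 10 * SAMPLE_RATE   # 10s flush interval (previously 5s)
MINIMUM_SPEECH_SAMPLES = 1 * SAMPLE_RATE    # 1s minimum input for the recognizer


@dataclass
class SegmentBuffer:
    """Hypothetical buffer that decides when audio is handed to the recognizer."""

    samples: List[float] = field(default_factory=list)

    def append(self, chunk: List[float], speech_end: bool = False) -> Optional[List[float]]:
        """Add a chunk and return a segment to transcribe, or None if nothing is flushed."""
        self.samples.extend(chunk)

        # Keep buffering until the interval elapses or the VAD reports a pause.
        if not speech_end and len(self.samples) < FLUSH_INTERVAL_SAMPLES:
            return None

        # Flush: take everything buffered so far and start a fresh buffer.
        segment, self.samples = self.samples, []

        # Assumption: segments below the minimum length are dropped here.
        if len(segment) < MINIMUM_SPEECH_SAMPLES:
            return None
        return segment
```

In this sketch, a 3s utterance that ends with a speech-end event is flushed immediately, continuous speech is flushed once 10s has accumulated, and a half-second fragment is never sent to the recognizer.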
Contributors
Thanks to @Newarr for the thorough benchmarking and contribution!