altunenes/parakeet-rs v0.2.3 on GitHub

What's Changed

This release introduces Stateful (Cache-Aware) Streaming for the parakeet_realtime_eou_120m-v1 model.
TL;DR:
Previously, my streaming implementation used a stateless sliding-window approach that re-processed the full buffer on every chunk. This update switches to fully stateful streaming, where the encoder preserves its internal state (cache) between chunks. The result is far less processing per chunk and more stable long-form transcription.
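To make the difference concrete, here is a minimal toy sketch of the two strategies. It does not use the parakeet-rs API: `Cache`, `encode`, `run_stateless`, and `run_stateful` are hypothetical names, and the "encoder" is just an exponential moving average standing in for the real model. It only illustrates why carrying state between chunks keeps per-chunk work constant.

```rust
/// Hypothetical encoder state carried between chunks
/// (a stand-in for the real cache tensors).
#[derive(Clone, Copy, Default)]
struct Cache {
    ema: f32,
}

/// Toy "encoder": folds samples into an exponential moving average.
/// Returns the updated cache and the number of samples it had to touch.
fn encode(cache: Cache, samples: &[f32]) -> (Cache, usize) {
    let mut ema = cache.ema;
    for &x in samples {
        ema = 0.9 * ema + 0.1 * x;
    }
    (Cache { ema }, samples.len())
}

/// Stateless approach: append each chunk to a growing buffer and
/// re-encode the whole buffer from scratch every time.
fn run_stateless(chunks: &[Vec<f32>]) -> (f32, usize) {
    let mut buffer: Vec<f32> = Vec::new();
    let mut work = 0;
    let mut out = 0.0;
    for chunk in chunks {
        buffer.extend_from_slice(chunk);
        let (cache, touched) = encode(Cache::default(), &buffer);
        out = cache.ema;
        work += touched;
    }
    (out, work)
}

/// Stateful approach: keep the cache between calls and encode
/// only the newly arrived chunk.
fn run_stateful(chunks: &[Vec<f32>]) -> (f32, usize) {
    let mut cache = Cache::default();
    let mut work = 0;
    for chunk in chunks {
        let (next, touched) = encode(cache, chunk);
        cache = next;
        work += touched;
    }
    (cache.ema, work)
}
```

In this toy, both functions produce the same final value, but the stateless version touches a number of samples that grows with the buffer on every step, while the stateful version always touches exactly one chunk.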

BREAKING CHANGE: Model Update Required for parakeet_realtime_eou_120m-v1

The ONNX graph signature has changed to accept cache tensors. You must re-download the model files to use v0.2.3. Old .onnx files will not work and will cause shape mismatch errors.

Download New Models: HuggingFace: parakeet-rs / realtime_eou_120m-v1-onnx

Required files: encoder.onnx, decoder_joint.onnx, tokenizer.json
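As a small convenience, a pre-flight check along these lines can confirm that the three required files are present before loading the model. This is a sketch using only the Rust standard library; `missing_model_files` and `REQUIRED_FILES` are hypothetical names, not part of parakeet-rs. It checks presence only and cannot tell old model files from new ones.

```rust
use std::path::Path;

/// File names taken from the "Required files" list in these release notes.
const REQUIRED_FILES: [&str; 3] = ["encoder.onnx", "decoder_joint.onnx", "tokenizer.json"];

/// Returns the required files that are absent from `model_dir`.
/// Hypothetical helper; it does not inspect file contents or versions.
fn missing_model_files(model_dir: &Path) -> Vec<&'static str> {
    REQUIRED_FILES
        .iter()
        .copied()
        .filter(|name| !model_dir.join(name).is_file())
        .collect()
}
```

An empty result means all three files exist in the directory; anything else lists what still needs to be downloaded.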

PR:

migrate full cache-aware streaming by @altunenes in #22

Summary:

Implemented stateful encoder inference using cache_aware_stream_step.
Drastic reduction in CPU usage for long audio sessions; latency is now independent of buffer length.

Full Changelog: v0.2.2...v0.2.3