This release contains the NVIDIA Parakeet TDT 0.6B v3 model converted to ONNX format with INT8 quantization for fast CPU inference. The Whispering app passes this model to transcribe-rs for transcription.

The original model is NVIDIA's Parakeet TDT 0.6B v3; the ONNX conversion is by istupakov. We chose INT8 quantization for CPU performance: the decoder is over 4x smaller, the encoder is over 15x smaller, and inference is faster. Licensed under CC-BY-4.0.
## Download Options
### Option 1: Complete Archive (Recommended)
Download the pre-packaged tar.gz that contains all model files:
```bash
curl -L -o parakeet-tdt-0.6b-v3-int8.tar.gz \
  https://github.com/epicenter-md/epicenter/releases/download/models/parakeet-tdt-0.6b-v3-int8/parakeet-tdt-0.6b-v3-int8.tar.gz
tar -xzf parakeet-tdt-0.6b-v3-int8.tar.gz
```
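To sanity-check the extraction, you can list the directory contents and compare them against the file list under Option 2:

```bash
# List the extracted model files; expect config.json, decoder_joint-model.int8.onnx,
# encoder-model.int8.onnx, nemo128.onnx, and vocab.txt (roughly 670 MB in total).
ls -lh parakeet-tdt-0.6b-v3-int8/
```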
### Option 2: Individual Files
Download individual model files directly (useful for avoiding extraction issues):
- `config.json` (97 bytes)
- `decoder_joint-model.int8.onnx` (17.3 MB)
- `encoder-model.int8.onnx` (652 MB)
- `nemo128.onnx` (136 KB)
- `vocab.txt` (93.9 KB)
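If the individual files are published as release assets alongside the archive, they can be fetched with the same URL prefix as the tar.gz above. The asset URL pattern below is an assumption based on that prefix; adjust it if the release uses different asset paths:

```bash
# Assumed asset URL prefix, taken from the tar.gz link above.
BASE="https://github.com/epicenter-md/epicenter/releases/download/models/parakeet-tdt-0.6b-v3-int8"
mkdir -p parakeet-tdt-0.6b-v3-int8
for f in config.json decoder_joint-model.int8.onnx encoder-model.int8.onnx nemo128.onnx vocab.txt; do
  curl -L -o "parakeet-tdt-0.6b-v3-int8/$f" "$BASE/$f"
done
```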
## Build It Yourself
You can recreate this package by downloading the files from the source repository:
```bash
mkdir parakeet-tdt-0.6b-v3-int8 && cd parakeet-tdt-0.6b-v3-int8
curl -L -o config.json "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/config.json?download=true"
curl -L -o decoder_joint-model.int8.onnx "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/decoder_joint-model.int8.onnx?download=true"
curl -L -o encoder-model.int8.onnx "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/encoder-model.int8.onnx?download=true"
curl -L -o nemo128.onnx "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/nemo128.onnx?download=true"
curl -L -o vocab.txt "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/vocab.txt?download=true"
cd .. && tar -czf parakeet-tdt-0.6b-v3-int8.tar.gz parakeet-tdt-0.6b-v3-int8/
```
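If you rebuild the package, it can help to record checksums so downloads can be verified later. A minimal sketch using standard tools (`sha256sum` on Linux; use `shasum -a 256` on macOS):

```bash
# Record checksums for the archive and the individual model files.
sha256sum parakeet-tdt-0.6b-v3-int8.tar.gz parakeet-tdt-0.6b-v3-int8/* > SHA256SUMS
# Later, verify downloads against the recorded checksums.
sha256sum -c SHA256SUMS
```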
The model files can be used directly with transcribe-rs for transcription.