Parakeet TDT 0.6B v3 INT8 Model


This release contains the NVIDIA Parakeet TDT 0.6B v3 model converted to ONNX format with INT8 quantization for fast CPU inference. These model files are used by transcribe-rs for transcription in the Whispering app.

The original model is from NVIDIA's Parakeet, with ONNX conversion by istupakov. We chose INT8 quantization for optimal CPU performance: the decoder shrinks from 72.5 MB to 18.2 MB and the encoder from 2.44 GB to 652 MB, roughly a 4x size reduction for each component, resulting in significantly faster inference without requiring GPU acceleration. Licensed under CC-BY-4.0.

How Whispering Uses These Files

Unlike Whisper models that use a single GGML file, Parakeet requires a structured directory with multiple ONNX files. When you click "Download" in Whispering's model settings, it automatically fetches each file and organizes them into the correct directory structure that transcribe-rs expects. You can see the implementation in LocalModelDownloadCard.svelte.
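For reference, the directory layout that transcribe-rs expects (using the filenames from the download commands further below) looks like this:

```
parakeet-tdt-0.6b-v3-int8/
├── config.json
├── decoder_joint-model.int8.onnx
├── encoder-model.int8.onnx
├── nemo128.onnx
└── vocab.txt
```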

The INT8 quantization reduces the total model size to ~671 MB (down from 2.5 GB unquantized) while maintaining transcription quality, making it practical for CPU-only inference.
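As a quick sanity check on those numbers, the ratios can be recomputed from the sizes quoted above (a small awk sketch; values in MB, encoder and decoder only, with the small auxiliary files accounting for the last ~1 MB of the ~671 MB total):

```shell
# Recompute the compression figures from the sizes quoted above (MB)
awk 'BEGIN {
  printf "decoder: 72.5 -> 18.2 MB (%.2fx smaller)\n", 72.5 / 18.2
  printf "encoder: 2440 -> 652 MB (%.2fx smaller)\n", 2440 / 652
  # config.json, vocab.txt, and nemo128.onnx add roughly 1 MB on top
  printf "encoder + decoder: %.0f MB\n", 652 + 18.2
}'
```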

Build It Yourself

You can recreate this model directory by downloading the files directly from the source repository. The full list of files, including non-quantized versions, is available at istupakov/parakeet-tdt-0.6b-v3-onnx. The commands below fetch the INT8-quantized files this release uses:

mkdir parakeet-tdt-0.6b-v3-int8 && cd parakeet-tdt-0.6b-v3-int8

curl -L -o config.json "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/config.json?download=true"
curl -L -o decoder_joint-model.int8.onnx "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/decoder_joint-model.int8.onnx?download=true"
curl -L -o encoder-model.int8.onnx "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/encoder-model.int8.onnx?download=true"
curl -L -o nemo128.onnx "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/nemo128.onnx?download=true"
curl -L -o vocab.txt "https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx/resolve/main/vocab.txt?download=true"

The resulting directory can be used directly with transcribe-rs by pointing it at the directory path.
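After downloading, a quick shell check can confirm the directory is complete before handing it to transcribe-rs (a minimal sketch; `check_model_dir` is a hypothetical helper name, not part of transcribe-rs):

```shell
# check_model_dir: confirm the five expected files exist and are non-empty
check_model_dir() {
  dir=$1
  for f in config.json decoder_joint-model.int8.onnx \
           encoder-model.int8.onnx nemo128.onnx vocab.txt; do
    # -s is true only for files that exist with size > 0
    [ -s "$dir/$f" ] || { echo "missing or empty: $dir/$f" >&2; return 1; }
  done
  echo "all 5 model files present in $dir"
}
```

For example, `check_model_dir parakeet-tdt-0.6b-v3-int8` prints a confirmation, or names the first missing file and returns a non-zero status.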
