kizuna-ai-lab/sokuji v0.15.25 on GitHub

What's New

IBM Granite Speech — Local ASR + Direct Speech Translation (WebGPU)

IBM's Granite Speech model (1B parameters) is now available as a local ASR engine with WebGPU acceleration.

6 languages for transcription: English, French, German, Spanish, Portuguese, Japanese
8 languages for direct speech translation (AST): the 6 above plus Italian and Chinese
AST mode: for supported language pairs, Granite translates speech directly without a separate translation model, which means fewer downloads and lower latency
Requires a WebGPU-capable browser
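
As a rough illustration of how a client might decide between AST and a two-step pipeline, here is a minimal sketch. It is not Sokuji's actual code: the function names, the language codes, and the pairing rule (source must be one of the 6 transcription languages, target one of the 8 AST languages) are assumptions made for the example. Only `navigator.gpu.requestAdapter()` is a real WebGPU call.

```typescript
// Language sets as listed in the release notes (ISO 639-1 codes are an assumption).
const ASR_LANGUAGES: readonly string[] = ["en", "fr", "de", "es", "pt", "ja"];
const AST_LANGUAGES: readonly string[] = [...ASR_LANGUAGES, "it", "zh"];

// "unavailable" means Granite cannot be used for this request at all.
type Pipeline = "ast" | "asr+translation" | "unavailable";

// Minimal shape of navigator.gpu needed here, so the sketch also compiles
// without WebGPU type definitions installed.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

// True only when a WebGPU adapter can actually be obtained.
async function hasWebGPU(gpu: GpuLike | undefined): Promise<boolean> {
  if (!gpu) return false;
  try {
    return (await gpu.requestAdapter()) !== null;
  } catch {
    return false;
  }
}

// ASSUMPTION: AST applies when the source is a transcription language and the
// target is an AST language; the real pairing rules may be narrower.
function choosePipeline(source: string, target: string, webgpu: boolean): Pipeline {
  if (!webgpu) return "unavailable";
  if (ASR_LANGUAGES.indexOf(source) === -1) return "unavailable";
  return AST_LANGUAGES.indexOf(target) !== -1 ? "ast" : "asr+translation";
}
```

In a browser, the flag would come from something like `await hasWebGPU((navigator as any).gpu)`, and the result would then be passed to `choosePipeline`.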

Participant Mode for Local Inference

Local inference now supports participant audio — translate what other people in the meeting are saying, not just your own voice.

Automatically selects reverse-direction ASR and translation models
Falls back to transcription-only mode when translation models aren't available
Works with system audio capture on supported platforms
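
The reverse-direction selection and the transcription-only fallback described above can be sketched as follows. This is a hypothetical illustration, not Sokuji's implementation: the `LocalModels` registry, the `"src->tgt"` key format, and the model ids are all invented for the example, and it assumes participants speak the user's target language.

```typescript
// Hypothetical registry of locally available models.
interface LocalModels {
  asr: Record<string, string>;         // language code -> ASR model id
  translation: Record<string, string>; // "src->tgt" -> translation model id
}

type ParticipantPlan =
  | { mode: "translate"; asrModel: string; translationModel: string }
  | { mode: "transcribe-only"; asrModel: string }
  | { mode: "unavailable" };

function planParticipantPipeline(
  userSource: string,
  userTarget: string,
  models: LocalModels
): ParticipantPlan {
  // ASSUMPTION: participants speak the user's target language,
  // so the participant pipeline runs in the reverse direction.
  const source = userTarget;
  const target = userSource;

  const asrModel = models.asr[source];
  if (asrModel === undefined) return { mode: "unavailable" };

  const translationModel = models.translation[`${source}->${target}`];
  if (translationModel === undefined) {
    // No reverse translation model: show the transcript only.
    return { mode: "transcribe-only", asrModel };
  }
  return { mode: "translate", asrModel, translationModel };
}
```

For a user speaking English with Japanese as the target, the participant pipeline would look up a Japanese ASR model and a `ja->en` translation model, and degrade to a Japanese transcript if the latter is missing.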

Model Preferences Per Language Pair

Sokuji now remembers your model selections for each language pair. When you switch between language pairs, your preferred ASR, translation, and TTS models are automatically restored.
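
One way such per-pair memory could be structured is a map keyed by language pair. The sketch below is an assumption-laden illustration, not the actual Sokuji data model: `ModelPreferenceStore`, its method names, and the `"src->tgt"` key are hypothetical, and persistence is left out.

```typescript
// Model choices remembered for one language pair; every field is optional
// because the user may not have picked all three yet.
interface ModelPreference {
  asr?: string;
  translation?: string;
  tts?: string;
}

class ModelPreferenceStore {
  private readonly byPair = new Map<string, ModelPreference>();

  // Hypothetical key format: direction matters, so en->ja and ja->en differ.
  private static key(source: string, target: string): string {
    return `${source}->${target}`;
  }

  // Merge a partial update into whatever is already stored for the pair.
  remember(source: string, target: string, update: ModelPreference): void {
    const key = ModelPreferenceStore.key(source, target);
    this.byPair.set(key, { ...this.byPair.get(key), ...update });
  }

  // Return a copy of the stored choices, or an empty object for a new pair.
  restore(source: string, target: string): ModelPreference {
    return { ...this.byPair.get(ModelPreferenceStore.key(source, target)) };
  }
}
```

Merging on `remember` means changing only the TTS model does not discard the ASR and translation choices already saved for that pair.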

Settings UI Overhaul

Provider section redesigned with compact model info chips
Model tags are clickable — jump directly to model selection
New Help section with support email and GitHub Discussions links
"API Key" section renamed to "Provider" across all 30 languages

Bug Fixes

Fixed a crash when parsing dismissed-tutorial data stored in localStorage
Fixed various model auto-selection edge cases with AST-capable models
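
The release notes do not say what caused the localStorage crash. As a general pattern for this class of bug, stored JSON can be parsed defensively so that malformed data yields a safe default. The sketch below assumes the dismissed tutorials are stored as a JSON array of string ids, and both the function name and the storage key in the usage note are hypothetical.

```typescript
// Parse a stored value defensively: any missing, malformed, or unexpected
// data results in an empty list instead of an exception.
// ASSUMPTION: the value is a JSON array of tutorial id strings.
function parseDismissedTutorials(raw: string | null): string[] {
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return [];
    // Keep only string entries; drop anything else that crept in.
    return parsed.filter((item): item is string => typeof item === "string");
  } catch {
    // JSON.parse throws a SyntaxError on invalid input.
    return [];
  }
}
```

In a browser this would be called as `parseDismissedTutorials(localStorage.getItem("dismissedTutorials"))`, where the key name is made up for the example.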

Install: Chrome Web Store · Website