github kizuna-ai-lab/sokuji v0.30.0

4 hours ago

What's new

This release is a UI/UX redesign of how you choose what gets translated. Translating other participants' audio was already possible — but it was buried in the last settings section with no presence on the main panel, and your "mode" was something you had to infer from three separate device icons. v0.30.0 turns that into three explicit, one-click modes.

Highlights

🎛️ Three clear translation modes

Your session mode is now stated explicitly in the footer and switchable in one click (before a session starts):

  • Speaker — you speak, others hear the translation (translate your voice out)
  • Participant — translate other people's audio into subtitles for you (e.g. follow a foreign-language meeting by reading along)
  • Both — bidirectional

None of these are new capabilities — Participant translation already worked, it was just hard to find. What's new is that the modes are named, shown, and switchable from the main panel via a segmented Mode Picker. Re-click the active mode to open an inline device popover and configure that mode's mic / participant / output devices without leaving the panel.
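
As a rough illustration of the behavior described above, the three modes can be modeled as a mapping from mode to the channels in scope, plus a click handler for the picker. This is a minimal sketch, not Sokuji's actual code: every name here (`SessionMode`, `channelsFor`, `onModeClick`) is hypothetical, and whether the device popover can be opened while a session is running is an assumption the release notes do not confirm.

```typescript
// Hypothetical types and names for illustration; not taken from the Sokuji codebase.
type SessionMode = "speaker" | "participant" | "both";
type Channel = "speaker" | "participant";

// Which processing channels each mode puts in scope.
const CHANNELS_BY_MODE: Record<SessionMode, readonly Channel[]> = {
  speaker: ["speaker"],
  participant: ["participant"],
  both: ["speaker", "participant"],
};

function channelsFor(mode: SessionMode): readonly Channel[] {
  return CHANNELS_BY_MODE[mode];
}

type PickerAction =
  | { kind: "switch"; mode: SessionMode }
  | { kind: "openDevicePopover"; mode: SessionMode }
  | { kind: "ignore" };

// Clicking the active mode opens its device popover; clicking another mode
// switches to it, but only before a session starts.
// Assumption: the popover stays available mid-session.
function onModeClick(
  active: SessionMode,
  clicked: SessionMode,
  sessionActive: boolean,
): PickerAction {
  if (clicked === active) return { kind: "openDevicePopover", mode: clicked };
  if (sessionActive) return { kind: "ignore" };
  return { kind: "switch", mode: clicked };
}
```

Keeping the mode-to-channel mapping in one table means a Participant-only session never has to reference the speaker channel at all.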

📊 Three live audio meters

Advanced mode now shows exactly the streams the active mode uses — your mic, incoming participant audio, and the translation output sent to the virtual microphone — each with a hover tooltip explaining what it is.

🔘 Honest On/Off controls

Channel toggles were labeled "Mute", which was misleading: they gate our processing pipeline, while your system audio still plays through the OS regardless. They're now plain On/Off, and which channels are in scope is set only by the Mode Picker, so flipping a channel can no longer silently collapse your mode. Device pickers that did nothing useful were removed (the participant "source" on desktop was a single fixed entry). On the browser extension, passthrough now keeps playing regardless of On/Off and follows your OS default output device.

🎧 Push-to-Translate passthrough, clarified

When Push-to-Translate is active, it manages original-audio passthrough for you (on at 100% while idle, off while you hold the key). The passthrough control now reflects that: it is shown as On and managed, the misleading 0–60% slider is hidden, and the explanation is localized in every language.
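
The managed-passthrough rule can be sketched as a small pure function. This is an illustrative sketch under stated assumptions, not the app's implementation: `effectivePassthrough` and `PassthroughState` are hypothetical names, volumes are expressed as fractions (1 = 100%), and clamping the manual slider to 0.6 simply mirrors the 0–60% range mentioned above.

```typescript
// Hypothetical names for illustration; not taken from the Sokuji codebase.
interface PassthroughState {
  enabled: boolean; // is original audio passed through right now?
  volume: number;   // 0..1, where 1 means 100%
  managed: boolean; // true when Push-to-Translate controls passthrough
}

const MANUAL_MAX = 0.6; // assumption: mirrors the 0–60% manual slider

function effectivePassthrough(
  pushToTranslateActive: boolean,
  keyHeld: boolean,
  userVolume: number,
): PassthroughState {
  if (!pushToTranslateActive) {
    // Manual control: the user's slider value applies, clamped to its range.
    const volume = Math.min(MANUAL_MAX, Math.max(0, userVolume));
    return { enabled: volume > 0, volume, managed: false };
  }
  // Managed: off while the key is held, on at 100% while idle.
  return keyHeld
    ? { enabled: false, volume: 0, managed: true }
    : { enabled: true, volume: 1, managed: true };
}
```

A UI built on this would hide the slider whenever `managed` is true, since the slider value has no effect in that state.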

Under the hood

  • Leaner Participant-only sessions: previously the speaker channel was always opened even when you only wanted to translate others — an unused connection, and billed token usage for Kizuna AI on a channel you never used. Either channel can now be the sole channel, so Participant mode no longer starts the speaker client.
  • Output meter reflects the real output: shows the signal actually sent to the virtual mic (translation + passthrough), independent of monitor volume.
  • Karaoke fixes: no more double-highlight; the highlight clears the moment a line completes.
  • Participant rows are text-only — no misleading play button on content with no audio track.
  • Passthrough reliability: recovers automatically from a stuck AudioContext, with adaptive catch-up and producer-silence skipping (#246).
  • Analytics: each session records which channels were used.
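
To make the output-meter item above concrete, here is one way a meter level could be derived from the signal sent to the virtual mic. It is a minimal sketch only: `outputMeterLevel` is a hypothetical function, RMS is an assumed choice of level measure (the notes do not say how the meter is computed), and the inputs are assumed to be same-rate sample buffers.

```typescript
// Hypothetical helper for illustration; not taken from the Sokuji codebase.
// Computes the RMS level of the signal sent to the virtual mic:
// translation audio plus passthrough audio scaled by its gain.
// Monitor volume is deliberately not a parameter, so it cannot affect the meter.
function outputMeterLevel(
  translation: Float32Array,
  passthrough: Float32Array,
  passthroughGain: number,
): number {
  const n = Math.max(translation.length, passthrough.length);
  if (n === 0) return 0;

  let sumSquares = 0;
  for (let i = 0; i < n; i++) {
    // Out-of-range reads yield undefined, which is treated as silence.
    const t = translation[i] ?? 0;
    const p = (passthrough[i] ?? 0) * passthroughGain;
    const mixed = t + p;
    sumSquares += mixed * mixed;
  }
  return Math.sqrt(sumSquares / n);
}
```

Because the level is computed from the mixed buffers rather than from what the local monitor plays, turning the monitor down leaves the reading unchanged.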

Installation

Platform                         Asset
macOS (Apple Silicon)            Sokuji-0.30.0-arm64.pkg
macOS (Intel)                    Sokuji-0.30.0-x64.pkg
Windows                          Sokuji-0.30.0.Setup.exe
Linux (.deb, x64)                sokuji_0.30.0_amd64.deb
Linux (.deb, arm64)              sokuji_0.30.0_arm64.deb
Linux (AppImage, x64)            Sokuji-0.30.0-x86_64.AppImage
Linux (AppImage, arm64)          Sokuji-0.30.0-arm64.AppImage
Browser extension (Chrome/Edge)  sokuji-extension-0.30.0.zip

Existing installations on macOS / Windows auto-update on next launch.


Full change log: v0.28.0…v0.30.0
