What's new in v0.25.0
This release brings OpenAI's two newest realtime voice models to Sokuji.
✨ New: OpenAI Translate provider
Dedicated speech-to-speech translation via OpenAI's brand-new gpt-realtime-translate model.
This isn't another general realtime model with a "you are a translator" prompt — it's a purpose-built translation endpoint that streams translated audio continuously alongside synchronized source and target transcripts. No risk of the model interpreting your speech as a question and answering it instead of translating, no server-side turn detection to fight, and noticeably lower latency than chaining transcription + chat + TTS.
- 13 target languages · 75 source languages (auto-detected)
- Both WebSocket and WebRTC transports
- Push-to-Translate Speech Mode supported
- Independent Source pause and Translation pause sliders (0.1s – 3.0s) for fine-tuning how the UI splits messages
- Participant translation supported, with a warning when the selected source language falls outside the 13 model-supported targets
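The target-language choice and the two pause sliders ultimately come down to per-session configuration. Here is a minimal sketch of what such a payload might look like; only the model name comes from these notes, and the field names (`target_language`, `source_pause`, `translation_pause`) are illustrative assumptions, not the real schema:

```python
def build_translate_session(target_language: str,
                            source_pause_s: float = 0.5,
                            translation_pause_s: float = 0.5) -> dict:
    """Build a hypothetical session.update-style payload for the
    gpt-realtime-translate model. Field names are assumptions."""
    # Clamp the pause values to the UI slider range (0.1s - 3.0s).
    def clamp(v: float) -> float:
        return min(max(v, 0.1), 3.0)

    return {
        "type": "session.update",
        "session": {
            "model": "gpt-realtime-translate",
            "target_language": target_language,  # one of the 13 targets
            # Source language is auto-detected across 75 languages,
            # so no source-language field is needed here.
            "source_pause": clamp(source_pause_s),
            "translation_pause": clamp(translation_pause_s),
        },
    }

cfg = build_translate_session("ja", source_pause_s=5.0)
print(cfg["session"]["source_pause"])  # 5.0 is clamped down to 3.0
```

The clamping mirrors the slider bounds above: values outside 0.1s to 3.0s are pulled back into range rather than rejected.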
🚀 New realtime model: gpt-realtime-2
gpt-realtime-2 is now selectable for the OpenAI provider, bringing the new reasoning.effort parameter (minimal → low → medium → high → xhigh). Crank it up for high-stakes translation; dial it down for latency-sensitive conversation. Same model, configurable per session.
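The dotted name suggests effort lives under a `reasoning` object in the session config, though the exact schema is an assumption here. A sketch of per-session configuration with the five levels from these notes:

```python
# The five effort levels named in the release notes, lowest to highest.
EFFORT_LEVELS = ["minimal", "low", "medium", "high", "xhigh"]

def session_with_effort(effort: str) -> dict:
    """Build a hypothetical session payload for gpt-realtime-2 with a
    per-session reasoning effort. Nesting under "reasoning" is assumed
    from the dotted parameter name reasoning.effort."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown reasoning effort: {effort}")
    return {
        "type": "session.update",
        "session": {
            "model": "gpt-realtime-2",
            "reasoning": {"effort": effort},
        },
    }

# High effort for a high-stakes translation session:
print(session_with_effort("high")["session"]["reasoning"]["effort"])  # high
```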
The transcription-only gpt-realtime-whisper is filtered out of the voice-agent model list — it doesn't fit the end-to-end speech-to-speech flow.
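The filtering itself is straightforward; a sketch of the idea, with the model-ID list and set names as illustrative assumptions:

```python
# All realtime model IDs a provider might report (illustrative list).
AVAILABLE_MODELS = [
    "gpt-realtime-2",
    "gpt-realtime-translate",
    "gpt-realtime-whisper",
]

# Transcription-only models cannot drive the end-to-end
# speech-to-speech flow, so they are excluded from the picker.
TRANSCRIPTION_ONLY = {"gpt-realtime-whisper"}

voice_agent_models = [m for m in AVAILABLE_MODELS
                      if m not in TRANSCRIPTION_ONLY]
print(voice_agent_models)  # whisper variant is filtered out
```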
🌐 Localization
UI strings for the new OpenAI Translate provider have been added across all 30 supported locales.
Full Changelog: v0.24.0...v0.25.0