Highlights
- Added OpenAI gpt-realtime-whisper support with live mic OSD transcript preview.
- Reworked Realtime WebSocket setup into a simpler provider + model selection flow.
- Realtime WebSocket is no longer labeled experimental.
- Added automated Continue PR review through opub.
- Updated website dependency devalue from 5.6.4 to 5.8.1.
User-Facing Changes
- Realtime setup is shorter and less confusing.
- Selecting Realtime WS now shows a flat list like:
  - OpenAI: GPT Realtime Whisper
  - Google (Gemini): Gemini 3.1 Flash Live
  - Google (Gemini): Gemini 2.5 Flash Native Audio
  - ElevenLabs: Scribe v2 Realtime
  - Custom WebSocket endpoint
- Setup no longer asks users to choose Transcribe vs Converse.
- Realtime WebSocket setup now defaults to transcription mode.
- OpenAI realtime transcription can show partial text in the mic OSD before final paste.
- Realtime WebSocket docs now describe OpenAI, Google, and ElevenLabs support as mature.
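
For readers wondering what a flat provider + model list looks like in practice, here is a minimal sketch. It is not the project's actual code: the RealtimeChoice type, the CHOICES table, the mode field, and the flat_menu helper are hypothetical names chosen for illustration. Only the menu labels come from these notes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RealtimeChoice:
    """One row in the flat menu (hypothetical type, for illustration only)."""
    provider: Optional[str]   # None marks the custom endpoint entry
    label: str                # model name shown to the user
    mode: str = "transcribe"  # setup now defaults to transcription mode


# Labels are taken from the release notes; the data structure is an assumption.
CHOICES: List[RealtimeChoice] = [
    RealtimeChoice("OpenAI", "GPT Realtime Whisper"),
    RealtimeChoice("Google (Gemini)", "Gemini 3.1 Flash Live"),
    RealtimeChoice("Google (Gemini)", "Gemini 2.5 Flash Native Audio"),
    RealtimeChoice("ElevenLabs", "Scribe v2 Realtime"),
    RealtimeChoice(None, "Custom WebSocket endpoint"),
]


def flat_menu(choices: List[RealtimeChoice]) -> List[str]:
    """Render each choice as 'Provider: Model', or the bare label if no provider."""
    return [
        c.label if c.provider is None else f"{c.provider}: {c.label}"
        for c in choices
    ]


if __name__ == "__main__":
    for line in flat_menu(CHOICES):
        print(line)
```

Keeping provider and model in a single row means the user makes one selection instead of two, and there is no separate Transcribe vs Converse prompt because each row carries a default mode.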
Developer / Maintenance Changes
- Added realtime preview tests.
- Added mic OSD daemon safety tests.
- Added startup guard tests around realtime partial callback registration.
- Added TRANSCRIPT_PREVIEW_FILE runtime path.
- Added pycairo to Python requirements and distro install docs/scripts.
- Updated config schema for realtime_transcription_delay.
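
One way a partial-transcript callback could hand text to the mic OSD through a preview file is sketched below. This is an assumption about the mechanism, not a description of the actual implementation: the file location, the write_preview, read_preview, and clear_preview helpers, and the idea that the OSD reads a plain text file are all hypothetical. Only the TRANSCRIPT_PREVIEW_FILE name comes from these notes.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical location; the real TRANSCRIPT_PREVIEW_FILE path is not given in these notes.
TRANSCRIPT_PREVIEW_FILE = Path(tempfile.gettempdir()) / "transcript_preview.txt"


def write_preview(text: str, path: Path = TRANSCRIPT_PREVIEW_FILE) -> None:
    """Replace the preview file in one step so a reader never sees half-written text."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(text, encoding="utf-8")
    os.replace(tmp, path)  # rename over the old file; atomic on the same filesystem


def read_preview(path: Path = TRANSCRIPT_PREVIEW_FILE) -> str:
    """Return the latest partial transcript, or an empty string if none exists yet."""
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return ""


def clear_preview(path: Path = TRANSCRIPT_PREVIEW_FILE) -> None:
    """Remove the preview once the final transcript has been pasted."""
    path.unlink(missing_ok=True)  # missing_ok requires Python 3.8+
```

A realtime client would call write_preview from its partial-result callback and clear_preview after the final paste. Writing to a temporary file and renaming it avoids the display side reading a truncated line.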
Full Changelog: v1.29.4...v1.30.0