Minor Changes
- #1293 16769b0 Thanks @threepointone! - Switch to per-call continuous STT sessions. Breaking API change.

The transcriber session is now created at `start_call` and lives for the entire call duration. The model handles turn detection, so no client-side `start_of_speech`/`end_of_speech` is required for STT. Voice agents use `keepAlive` to prevent DO eviction during calls.

New API:
- `transcriber` property replaces `stt`, `streamingStt`, and `vad`
- `createTranscriber(connection)` hook for runtime model switching
- `WorkersAIFluxSTT`: per-call Flux sessions (recommended for `withVoice`)
- `WorkersAINova3STT`: per-call Nova 3 streaming sessions (recommended for `withVoiceInput`)
- `query` option on `VoiceClientOptions`: pass query params to the WebSocket URL (e.g. for model selection)
- Throws at `start_call` if no transcriber is configured
- Duplicate `start_call` is silently ignored when already in a call
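
The `start_call` rules above (one session per call, a throw when no transcriber is configured, duplicates ignored) can be pictured with a small standalone model. This is only a sketch: `CallLifecycle`, `Transcriber`, `TranscriberSession`, and `CountingTranscriber` are hypothetical stand-ins written for illustration and are not the library's types or its implementation.

```typescript
// Hypothetical stand-in types; none of these names come from the library.
interface TranscriberSession {
  feed(chunk: Uint8Array): void;
  close(): void;
}

interface Transcriber {
  createSession(): TranscriberSession;
}

// Models the documented behavior: one transcriber session per call.
class CallLifecycle {
  private session: TranscriberSession | null = null;
  private transcriber: Transcriber | undefined;

  constructor(transcriber?: Transcriber) {
    this.transcriber = transcriber;
  }

  // Returns true if a call was started, false if the request was a duplicate.
  startCall(): boolean {
    if (!this.transcriber) {
      throw new Error("start_call: no transcriber configured");
    }
    if (this.session) {
      return false; // duplicate start_call is silently ignored
    }
    this.session = this.transcriber.createSession();
    return true;
  }

  // Audio is forwarded continuously; there is no per-utterance batching.
  pushAudio(chunk: Uint8Array): void {
    this.session?.feed(chunk);
  }

  endCall(): void {
    this.session?.close();
    this.session = null;
  }

  get inCall(): boolean {
    return this.session !== null;
  }
}

// Fake transcriber that records how it was used (illustration only).
class CountingTranscriber implements Transcriber {
  sessionsCreated = 0;
  bytesFed = 0;

  createSession(): TranscriberSession {
    this.sessionsCreated += 1;
    return {
      feed: (chunk) => {
        this.bytesFed += chunk.byteLength;
      },
      close: () => {},
    };
  }
}
```

In this model, calling `startCall()` twice on the same `CallLifecycle` creates a single session, and every chunk passed to `pushAudio` between `startCall()` and `endCall()` reaches that one session.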
Removed:
- `stt` (batch STT), `streamingStt` (per-utterance streaming), `vad` (server-side VAD)
- `WorkersAISTT`, `WorkersAIVAD`, `pcmToWav`
- `prerollMs`, `vadThreshold`, `vadPushbackSeconds`, `vadRetryMs`, `minAudioBytes` options
- `VoiceInputAgentOptions` type
- `beforeTranscribe` hook (audio is fed continuously, not in batches)
- `vad_ms` and `stt_ms` from pipeline metrics
- Hibernation support (`withVoice` and `withVoiceInput` now require `Agent`, not partyserver `Server`)
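
When migrating, it may help to scan existing configuration objects for names that no longer exist. The helper below is a hypothetical sketch, not something the package ships: `REMOVED_NAMES` copies the removed properties, options, and hook from the list above, while `findRemovedNames` and the shape of the config object are assumptions made for illustration.

```typescript
// Names copied from the "Removed" list; the helper itself is hypothetical.
const REMOVED_NAMES = [
  "stt",
  "streamingStt",
  "vad",
  "prerollMs",
  "vadThreshold",
  "vadPushbackSeconds",
  "vadRetryMs",
  "minAudioBytes",
  "beforeTranscribe",
] as const;

// Returns the removed names still present on a config-like object,
// in the order they appear in REMOVED_NAMES.
function findRemovedNames(config: Record<string, unknown>): string[] {
  return REMOVED_NAMES.filter((name) => name in config);
}
```

For example, `findRemovedNames({ stt: {}, transcriber: {}, vadThreshold: 0.5 })` returns `["stt", "vadThreshold"]`, which flags the two entries that need to be replaced or dropped.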