Added
- Added SpeechControlParamsFrame, a new SystemFrame that notifies downstream
  processors of the VAD and Turn analyzer params. This frame is pushed by the
  BaseInputTransport at Start and any time a VADParamsUpdateFrame is received.
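
As a rough illustration of how a downstream processor might react to this frame, here is a minimal, self-contained sketch. It does not import the library: the classes below are hypothetical stand-ins, and the field names vad_params and turn_params are assumptions, not confirmed by this changelog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VADParams:
    """Hypothetical stand-in for the VAD params; only stop_secs is modeled."""

    stop_secs: float = 0.8


@dataclass
class SpeechControlParamsFrame:
    """Hypothetical stand-in for the new frame; field names are assumptions."""

    vad_params: Optional[VADParams] = None
    turn_params: Optional[dict] = None


class ParamsAwareProcessor:
    """Toy downstream processor that remembers the latest VAD stop_secs."""

    def __init__(self) -> None:
        self.stop_secs: Optional[float] = None

    def process_frame(self, frame: object) -> None:
        # Only react to the params frame; every other frame is ignored here.
        if isinstance(frame, SpeechControlParamsFrame) and frame.vad_params:
            self.stop_secs = frame.vad_params.stop_secs
```

In a real pipeline the processor would also forward the frame downstream; that step is omitted to keep the sketch short.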
Changed
- Two package dependencies have been updated:
  - numpy now supports 1.26.0 and newer
  - transformers now supports 4.48.0 and newer
Fixed
- Fixed an issue with RTVI's handling of append-to-context.
- Fixed an issue where using audio input with a sample rate requiring resampling
  could result in empty audio being passed to STT services, causing errors.
- Fixed the VAD analyzer to process the full audio buffer as long as it contains
  more than the minimum required bytes per iteration, instead of only analyzing
  the first chunk.
- Fixed an issue in ParallelPipeline that caused errors when attempting to drain
  the queues.
- Fixed an issue with emulated VAD timeout inconsistency in
  LLMUserContextAggregator. Previously, emulated VAD scenarios (where
  transcription is received without VAD detection) used a hardcoded
  aggregation_timeout (default 0.5s) instead of matching the VAD's stop_secs
  parameter (default 0.8s). This created different user experiences between
  real VAD and emulated VAD scenarios. Now, emulated VAD timeouts automatically
  synchronize with the VAD's stop_secs parameter.
- Fixed a pipeline freeze when using AWS Nova Sonic, which would occur if the
  user started early, while the bot was still working through
  trigger_assistant_response().
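
Two of the fixes above can be illustrated with a small standalone sketch: consuming every complete chunk in a VAD buffer rather than only the first, and deriving the emulated VAD timeout from stop_secs. All function names here are hypothetical and this is not the library's code. The loop treats "at least one full chunk" as the boundary, and falling back to aggregation_timeout when stop_secs is unknown is an assumption.

```python
from typing import Callable, List, Optional, Tuple


def analyze_full_buffer(
    buffer: bytes,
    chunk_bytes: int,
    analyze_chunk: Callable[[bytes], float],
) -> Tuple[List[float], bytes]:
    """Run analyze_chunk on every complete chunk in buffer.

    Returns the per-chunk results plus the leftover bytes that did not
    fill a complete chunk (to be kept for the next call).
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")

    results: List[float] = []
    # Keep consuming while a full chunk remains (assumed boundary condition).
    while len(buffer) >= chunk_bytes:
        chunk, buffer = buffer[:chunk_bytes], buffer[chunk_bytes:]
        results.append(analyze_chunk(chunk))
    return results, buffer


def fake_confidence(chunk: bytes) -> float:
    """Hypothetical stand-in for a VAD model: fraction of non-zero bytes."""
    return sum(1 for b in chunk if b != 0) / len(chunk)


def emulated_vad_timeout(
    vad_stop_secs: Optional[float],
    aggregation_timeout: float = 0.5,
) -> float:
    """Pick the wait used when a transcription arrives without VAD detection.

    Follows the VAD's stop_secs when it is known; the fallback to
    aggregation_timeout is an assumption made for this sketch.
    """
    return vad_stop_secs if vad_stop_secs is not None else aggregation_timeout
```

For example, a 10-byte buffer with 4-byte chunks yields two analyzed chunks and 2 leftover bytes, where analyzing only the first chunk would have ignored the second.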