### Added

- VAD parameters can now be dynamically updated using the `VADParamsUpdateFrame`.
- `ErrorFrame` now has a `fatal` field to indicate the bot should exit if a fatal error is pushed upstream (false by default). A new `FatalErrorFrame` that sets this flag to true has been added.
- `AnthropicLLMService` now supports function calling and has initial support for prompt caching (see https://www.anthropic.com/news/prompt-caching).
- `ElevenLabsTTSService` can now specify ElevenLabs input parameters such as `output_format`.
- `TwilioFrameSerializer` can now specify the desired sample rates for Twilio and Pipecat.
- Added new `on_participant_updated` event to `DailyTransport`.
- Added `DailyRESTHelper.delete_room_by_name()` and `DailyRESTHelper.delete_room_by_url()`.
- Added LLM and TTS usage metrics. These are enabled when `PipelineParams.enable_usage_metrics` is `True`.
- `AudioRawFrame`s are now pushed downstream from the base output transport. This allows capturing the exact words the bot says by adding an STT service at the end of the pipeline.
- Added new `GStreamerPipelineSource`. This processor can generate image or audio frames from a GStreamer pipeline (e.g. reading an MP4 file, an RTP stream, or anything else supported by GStreamer).
- Added `TransportParams.audio_out_is_live`. This flag is `False` by default and is useful to indicate that audio should not be synchronized with sporadic images.
- Added new `BotStartedSpeakingFrame` and `BotStoppedSpeakingFrame` control frames. These frames are pushed upstream and should wrap `BotSpeakingFrame`.
- Transports now allow you to register event handlers without decorators.
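The decorator-free registration above boils down to calling the same underlying method the decorator would call. A minimal sketch of the pattern, using a stand-in `Transport` class (the class and method names here are illustrative, not Pipecat's actual API):

```python
# Sketch of decorator-free event handler registration with a stand-in class.
from typing import Callable, Dict, List


class Transport:
    def __init__(self):
        self._handlers: Dict[str, List[Callable]] = {}

    def event_handler(self, name: str):
        """Decorator style: @transport.event_handler("on_joined")."""
        def wrapper(func: Callable):
            self.add_event_handler(name, func)
            return func
        return wrapper

    def add_event_handler(self, name: str, func: Callable):
        """Direct style: register a handler without a decorator."""
        self._handlers.setdefault(name, []).append(func)

    def emit(self, name: str, *args):
        for handler in self._handlers.get(name, []):
            handler(*args)


transport = Transport()

# Decorator registration (existing style).
@transport.event_handler("on_joined")
def on_joined(who):
    print(f"decorator saw {who}")

# Direct registration (new style) -- handy when the handler already exists.
transport.add_event_handler("on_joined", lambda who: print(f"direct saw {who}"))

transport.emit("on_joined", "alice")
```

Both styles end up in the same handler list, so they can be mixed freely.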
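The fatal-error semantics described in the list above can be sketched with stand-in frame classes (these dataclasses only illustrate the changelog's description; they are not Pipecat's actual implementation):

```python
# Stand-in sketch of the fatal error-frame semantics described above.
from dataclasses import dataclass


@dataclass
class ErrorFrame:
    """An error pushed upstream; non-fatal by default."""
    error: str
    fatal: bool = False


@dataclass
class FatalErrorFrame(ErrorFrame):
    """Convenience frame that always marks the error as fatal."""
    fatal: bool = True


def should_exit(frame: ErrorFrame) -> bool:
    # A bot runner would exit only when a fatal error is pushed upstream.
    return frame.fatal


print(should_exit(ErrorFrame("tts timeout")))    # False
print(should_exit(FatalErrorFrame("no transport")))  # True
```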
### Changed

- Support RTVI message protocol 0.1. This includes new messages, support for message responses, support for actions, configuration, webhooks and a bunch of new cool stuff (see https://docs.rtvi.ai/).
- The `SileroVAD` dependency is now imported via pip's `silero-vad` package.
- `ElevenLabsTTSService` now uses the `eleven_turbo_v2_5` model by default.
- `BotSpeakingFrame` is now a control frame.
- `StartFrame` is now a control frame, similar to `EndFrame`.
- `DeepgramTTSService` is now more customizable. You can adjust the encoding and sample rate.
### Fixed

- `TTSStartFrame` and `TTSStopFrame` are now sent when TTS really starts and stops. This allows knowing when the bot starts and stops speaking, even with asynchronous services (like Cartesia).
- Fixed `AzureSTTService` transcription frame timestamps.
- Fixed an issue with `DailyRESTHelper.create_room()` expirations which would cause this function to stop working after the initial expiration elapsed.
- Improved `EndFrame` and `CancelFrame` handling. `EndFrame` should end things gracefully, while `CancelFrame` should cancel all running tasks as soon as possible.
- Fixed an issue in `AIService` that would cause a yielded `None` value to be processed.
- RTVI's `bot-ready` message is now sent when the RTVI pipeline is ready and a first participant joins.
- Fixed a `BaseInputTransport` issue that was causing incoming system frames to be queued instead of being pushed immediately.
- Fixed a `BaseInputTransport` issue where incoming start/stop interruption frames were not cancelling tasks and being processed properly.
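The `EndFrame`/`CancelFrame` distinction above amounts to "drain queued work, then stop" versus "cancel running tasks immediately". A generic asyncio sketch of the two shutdown modes (the names and structure here are illustrative, not Pipecat's internals):

```python
import asyncio


async def worker(queue: asyncio.Queue, done: list):
    # Drain the queue until a sentinel "end" arrives (graceful shutdown).
    while True:
        item = await queue.get()
        if item == "end":
            break
        done.append(item)


async def main():
    # EndFrame-like: enqueue remaining work, then the sentinel; everything
    # queued before "end" is still processed.
    queue, done = asyncio.Queue(), []
    task = asyncio.create_task(worker(queue, done))
    for item in ("a", "b", "end"):
        queue.put_nowait(item)
    await task
    print(done)  # ['a', 'b']

    # CancelFrame-like: cancel the task immediately; pending work is dropped.
    queue2, done2 = asyncio.Queue(), []
    task2 = asyncio.create_task(worker(queue2, done2))
    queue2.put_nowait("c")
    task2.cancel()
    try:
        await task2
    except asyncio.CancelledError:
        pass
    print(done2)


asyncio.run(main())
```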
### Other

- Added `studypal` example (thanks to the Cartesia folks!).
- Most examples now use Cartesia.
- Added examples `foundational/19a-tools-anthropic.py`, `foundational/19b-tools-video-anthropic.py` and `foundational/19a-tools-togetherai.py`.
- Added examples `foundational/18-gstreamer-filesrc.py` and `foundational/18a-gstreamer-videotestsrc.py` that show how to use `GStreamerPipelineSource`.
- Removed `requests` library usage.
- Cleaned up examples to use `DailyRESTHelper`.