pipecat-ai/pipecat v0.0.55 on GitHub

Added

Added a new start_metadata field to PipelineParams. The provided metadata will be set to the initial StartFrame being pushed from the PipelineTask.
Added new fields to PipelineParams to control audio input and output sample rates for the whole pipeline. This allows controlling sample rates from a single place instead of having to specify sample rates in each service. Setting a sample rate to a service is still possible and will override the value from PipelineParams.
Introduce audio resamplers (BaseAudioResampler). This is just a base class to implement audio resamplers. Currently, two implementations are provided SOXRAudioResampler and ResampyResampler. A new create_default_resampler() has been added (replacing the now deprecated resample_audio()).
It is now possible to specify the asyncio event loop that a PipelineTask and all the processors should run on by passing it as a new argument to the PipelineRunner. This could allow running pipelines in multiple threads each one with its own event loop.
Added a new utils.TaskManager. Instead of a global task manager we now have a task manager per PipelineTask. In the previous version the task manager was global, so running multiple simultaneous PipelineTasks could result in dangling task warnings which were not actually true. In order, for all the processors to know about the task manager, we pass it through the StartFrame. This means that processors should create tasks when they receive a StartFrame but not before (because they don't have a task manager yet).
Added TelnyxFrameSerializer to support Telnyx calls. A full running example has also been added to examples/telnyx-chatbot.
Allow pushing silence audio frames before TTSStoppedFrame. This might be useful for testing purposes, for example, passing bot audio to an STT service which usually needs additional audio data to detect the utterance stopped.
TwilioSerializer now supports transport message frames. With this we can create Twilio emulators.
Added a new transport: WebsocketClientTransport.
Added a metadata field to Frame which makes it possible to pass custom data to all frames.
Added test/utils.py inside of pipecat package.

Changed

GatedOpenAILLMContextAggregator now require keyword arguments. Also, a new start_open argument has been added to set the initial state of the gate.
Added organization and project level authentication to OpenAILLMService.
Improved the language checking logic in ElevenLabsTTSService and ElevenLabsHttpTTSService to properly handle language codes based on model compatibility, with appropriate warnings when language codes cannot be applied.
Updated GoogleLLMContext to support pushing LLMMessagesUpdateFrames that contain a combination of function calls, function call responses, system messages, or just messages.
InputDTMFFrame is now based on DTMFFrame. There's also a new OutputDTMFFrame frame.

Deprecated

resample_audio() is now deprecated, use create_default_resampler() instead.

Removed

AudioBufferProcessor.reset_audio_buffers() has been removed, use AudioBufferProcessor.start_recording() and AudioBufferProcessor.stop_recording() instead.

Fixed

Fixed a AudioBufferProcessor that would cause crackling in some recordings.
Fixed an issue in AudioBufferProcessor where user callback would not be called on task cancellation.
Fixed an issue in AudioBufferProcessor that would cause wrong silence padding in some cases.
Fixed an issue where ElevenLabsTTSService messages would return a 1009 websocket error by increasing the max message size limit to 16MB.
Fixed a DailyTransport issue that would cause events to be triggered before join finished.
Fixed a PipelineTask issue that was preventing processors to be cleaned up after cancelling the task.
Fixed an issue where queuing a CancelFrame to a pipeline task would not cause the task to finish. However, using PipelineTask.cancel() is still the recommended way to cancel a task.

Other

Improved Unit Test run_test() to use PipelineTask and PipelineRunner. There's now also some control around StartFrame and EndFrame. The EndTaskFrame has been removed since it doesn't seem necessary with this new approach.
Updated twilio-chatbot with a few new features: use 8000 sample rate and avoid resampling, a new client useful for stress testing and testing locally without the need to make phone calls. Also, added audio recording on both the client and the server to make sure the audio sounds good.
Updated examples to use task.cancel() to immediately exit the example when a participant leaves or disconnects, instead of pushing an EndFrame. Pushing an EndFrame causes the bot to run through everything that is internally queued (which could take some seconds). Note that using task.cancel() might not always be the best option and pushing an EndFrame could still be desirable to make sure all the pipeline is flushed.