github pipecat-ai/pipecat v0.0.58

13 hours ago

Added

  • Added track-specific audio event on_track_audio_data to AudioBufferProcessor for accessing separate input and output audio tracks.

  • Pipecat version will now be logged on every application startup. This will help us identify what version we are running in case of any issues.

  • Added a new StopFrame which can be used to stop a pipeline task while keeping the frame processors running. The frame processors could then be used in a different pipeline. The difference between a StopFrame and a StopTaskFrame is that, as with EndFrame and EndTaskFrame, the StopFrame is pushed from the task and the StopTaskFrame is pushed upstream inside the pipeline by any processor.

  • Added a new PipelineTask parameter observers that replaces the previous PipelineParams.observers.

  • Added a new PipelineTask parameter check_dangling_tasks to enable or disable checking for frame processors' dangling tasks when the Pipeline finishes running.

  • Added new on_completion_timeout event for LLM services (all OpenAI-based services, Anthropic and Google). Note that this event will only get triggered if LLM timeouts are set up and the timeout was reached. It can be useful to retrigger another completion and see if the timeout was just a blip.

  • Added new log observers LLMLogObserver and TranscriptionLogObserver that can be useful for debugging your pipelines.

  • Added room_url property to DailyTransport.

  • Added addons argument to DeepgramSTTService.

  • Added exponential_backoff_time() to utils.network module.
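
The on_completion_timeout event and exponential_backoff_time() naturally combine into a retry loop. The following is a minimal, self-contained sketch of that idea, not Pipecat's implementation: backoff_delay, complete_with_retry and all of their parameters are hypothetical names, and the real exponential_backoff_time() signature may differ, so check the utils.network module before relying on it.

```python
import asyncio
from typing import Awaitable, Callable


def backoff_delay(attempt: int, min_wait: float = 0.5, max_wait: float = 8.0) -> float:
    """Hypothetical helper: the delay doubles with each 1-based attempt, capped at max_wait."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Cap the exponent so very large attempt numbers cannot overflow a float.
    exponent = min(attempt - 1, 32)
    return min(min_wait * 2**exponent, max_wait)


async def complete_with_retry(
    run_completion: Callable[[], Awaitable[str]],
    timeout: float,
    max_attempts: int = 3,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> str:
    """Retry a completion that times out, waiting longer between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(run_completion(), timeout=timeout)
        except asyncio.TimeoutError:
            if attempt == max_attempts:
                raise  # give up: the timeout was more than a blip
            await sleep(backoff_delay(attempt))
    raise RuntimeError("unreachable")
```

In a real application the retry would presumably be started from an on_completion_timeout handler; the sleep parameter exists only so the waiting can be replaced in tests.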

Changed

  • ⚠️ PipelineTask now requires keyword arguments (except for the first one for the pipeline).

  • Updated PlayHTHttpTTSService to take a voice_engine and protocol input in the constructor. The previous method of providing a voice_engine input that contains the engine and protocol is deprecated by PlayHT.

  • The base TTSService class now strips leading newlines before sending text to the TTS provider. This change is to solve issues where some TTS providers, like Azure, would not output text due to newlines.

  • GrokLLMService now uses grok-2 as the default model.

  • AnthropicLLMService now uses claude-3-7-sonnet-20250219 as the default model.

  • RimeHttpTTSService now requires an aiohttp.ClientSession to be passed to the constructor, like all the other HTTP-based services.

  • RimeHttpTTSService doesn't use a default voice anymore.

  • DeepgramSTTService now uses the new nova-3 model by default. If you want to use the previous model you can pass LiveOptions(model="nova-2-general").
    (see https://deepgram.com/learn/introducing-nova-3-speech-to-text-api)

stt = DeepgramSTTService(..., live_options=LiveOptions(model="nova-2-general"))
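
The keyword-argument requirement for PipelineTask is the kind of change Python expresses with a bare * in a signature. The sketch below uses a stand-in class (SketchTask, a hypothetical name, not Pipecat's PipelineTask) to show which call style keeps working and which one raises TypeError. The parameter names observers and check_dangling_tasks come from these notes; their defaults here are assumptions.

```python
from typing import Any, List, Optional


class SketchTask:
    """Stand-in for illustration only; not Pipecat's PipelineTask."""

    def __init__(
        self,
        pipeline: Any,
        *,  # everything after this marker must be passed by keyword
        observers: Optional[List[Any]] = None,
        check_dangling_tasks: bool = True,  # assumed default
    ) -> None:
        self.pipeline = pipeline
        self.observers = observers or []
        self.check_dangling_tasks = check_dangling_tasks


# Still valid: the pipeline is positional, everything else is a keyword.
task = SketchTask("my-pipeline", observers=["log-observer"], check_dangling_tasks=False)

# No longer valid: a second positional argument raises TypeError.
try:
    SketchTask("my-pipeline", ["log-observer"])
except TypeError as exc:
    positional_error = str(exc)
```

Code that passed anything other than the pipeline positionally would need to name those arguments after upgrading.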

Deprecated

  • PipelineParams.observers is now deprecated; use the new PipelineTask parameter observers instead.

Removed

  • Removed TransportParams.audio_out_is_live since it was not being used at all.

Fixed

  • Fixed a GoogleLLMService issue that was causing an exception when sending inline audio in some cases.

  • Fixed an AudioContextWordTTSService issue that would cause an EndFrame to disconnect from the TTS service before audio from all the contexts was received. This affected services like Cartesia and Rime.

  • Fixed an issue that prevented passing an OpenAILLMContext to create GoogleLLMService's context aggregators.

  • Fixed an ElevenLabsTTSService, FishAudioTTSService, LMNTTTSService and PlayHTTTSService issue that was resulting in audio requested before an interruption being played after the interruption.

  • Fixed match_endofsentence support for ellipses.

  • Fixed an issue that would cause undesired interruptions via EmulateUserStartedSpeakingFrame when only interim transcriptions (i.e. no final transcriptions) were received.

  • Fixed an issue where EndTaskFrame was not triggering on_client_disconnected or closing the WebSocket in FastAPI.

  • Fixed an issue in DeepgramSTTService where the sample_rate passed to the LiveOptions was not being used, causing the service to use the pipeline's default sample rate.

  • Fixed a context aggregator issue that would not append the LLM text response to the context if a function call happened in the same LLM turn.

  • Fixed an issue that was causing HTTP TTS services to push TTSStoppedFrame more than once.

  • Fixed a FishAudioTTSService issue where TTSStoppedFrame was not being pushed.

  • Fixed an issue where start_callback was not invoked for some LLM services.

  • Fixed an issue that would cause DeepgramSTTService to stop working after an error occurred (e.g. sudden network loss). If the network recovered we would not reconnect.

  • Fixed an STTMuteFilter issue where user audio frames were not muted, causing transcriptions to be generated by the STT service.

Other

  • Added Gemini support to examples/phone-chatbot.

  • Added foundational example 34-audio-recording.py showing how to use the AudioBufferProcessor callbacks to save merged and track recordings.
