github pipecat-ai/pipecat v0.0.69

latest releases: v0.0.84, v0.0.83, v0.0.82...
3 months ago

Added

  • Added a new frame FunctionCallsStartedFrame. This frame is pushed both upstream and downstream from the LLM service to indicate that one or more function calls are going to be executed.

  • Added LLM services on_function_calls_started event. This event will be triggered when the LLM service receives function calls from the model and is going to start executing them.

  • Function calls can now be executed sequentially (in the order received in the completion) by passing run_in_parallel=False when creating your LLM service. By default, if the LLM completion returns 2 or more function calls they run concurrently. In both cases, concurrently and sequentially, a new LLM completion will run when the last function call finishes.

  • Added OpenTelemetry tracing for GeminiMultimodalLiveLLMService and OpenAIRealtimeBetaLLMService.

  • Added initial support for interruption strategies, which determine if the user should interrupt the bot while the bot is speaking. Interruption strategies can be based on factors such as audio volume or the number of words spoken by the user. These can be specified via the new interruption_strategies field in PipelineParams. A new MinWordsInterruptionStrategy strategy has been introduced which triggers an interruption if the user has spoken a minimum number of words. If no interruption strategies are specified, the normal interruption behavior applies. If multiple strategies are provided, the first one that evaluates to true will trigger the interruption.

  • BaseInputTransport now handles StopFrame. When a StopFrame is received the transport will pause sending frames downstream until a new StartFrame is received. This allows the transport to be reused (keeping the same connection) in a different pipeline.

  • Updated AssemblyAI STT service to support their latest streaming speech-to-text model with improved transcription latency and endpointing.

  • You can now access STT service results through the new TranscriptionFrame.result and InterimTranscriptionFrame.result field. This is useful in case you use some specific settings for the STT and you want to access the STT results.

  • The examples runner is now public from the pipecat.examples package. This allows everyone to build their own examples and run them easily.

  • It is now possible to push OutputDTMFFrame or OutputDTMFUrgentFrame with DailyTransport. This will be sent properly if a Daily dial-out connection has been established.

  • Added OutputDTMFUrgentFrame to send a DTMF keypress quickly. The previous OutputDTMFFrame queues the keypress with the rest of data frames.

  • Added DTMFAggregator, which aggregates keypad presses into TranscriptionFrames. Aggregation occurs after a timeout, termination key press, or user interruption. You can specify the prefix of the TranscriptionFrame.

  • Added new functions DailyTransport.start_transcription() and DailyTransport.stop_transcription() to be able to start and stop Daily transcription dynamically (maybe with different settings).

Changed

  • Reverted the default model for GeminiMultimodalLiveLLMService back to models/gemini-2.0-flash-live-001. gemini-2.5-flash-preview-native-audio-dialog has inconsistent performance. You can opt in to using this model by setting the model arg.

  • Function calls are now cancelled by default if there's an interruption. To disable this behavior you can set cancel_on_interruption=False when registering the function call. Since function calls are executed as tasks you can tell if a function call has been cancelled by catching the asyncio.CancelledError exception (and don't forget to raise it again!).

  • Updated OpenTelemetry tracing attribute metrics.ttfb_ms to metrics.ttfb. The attribute reports TTFB in seconds.

Deprecated

  • DailyTransport.send_dtmf() is deprecated, push an OutputDTMFFrame or an OutputDTMFUrgentFrame instead.

Fixed

  • Fixed an issue with ElevenLabsTTSService where long responses would continue generating output even after an interruption.

  • Fixed an issue with the OpenAILLMContext where non-Roman characters were being incorrectly encoded as Unicode escape sequences. This was a logging issue and did not impact the actual conversation.

  • In AWSBedrockLLMService, worked around a possible bug in AWS Bedrock where a toolConfig is required if there has been previous tool use in the messages array. This workaround includes a no_op factory function call is used to satisfy the requirement.

  • Fixed WebsocketClientTransport to use FrameProcessorSetup.task_manager instead of StartFrame.task_manager.

Performance

  • Use uvloop as the new event loop on Linux and macOS systems.

Don't miss a new pipecat release

NewReleases is sending notifications on new releases.