Added
-
Added
TextAggregationMetricsData metric measuring the time from the first LLM token to the first complete sentence, representing the latency cost of sentence aggregation in the TTS pipeline.
(PR #3696) -
Added support for using strongly-typed objects instead of dicts for updating service settings at runtime.
Instead of, say:
await task.queue_frame(STTUpdateSettingsFrame(settings={"language": Language.ES}))
you'd do:
await task.queue_frame(STTUpdateSettingsFrame(delta=DeepgramSTTSettings(language=Language.ES)))
Each service now vends strongly-typed classes like
DeepgramSTTSettings representing the service's runtime-updatable settings.
(PR #3714) -
Added support for specifying private endpoints for Azure Speech-to-Text, enabling use in private networks behind firewalls.
(PR #3764) -
Added
LemonSliceTransport and LemonSliceApi to support adding real-time LemonSlice Avatars to any Daily room.
(PR #3791) -
Added
output_medium parameter to AgentInputParams and OneShotInputParams in Ultravox service to control initial output medium (text or voice) at call creation time.
(PR #3806) -
Added
TurnMetricsData as a generic metrics class for turn detection, with e2e processing time measurement. KrispVivaTurn now emits TurnMetricsData with e2e_processing_time_ms tracking the interval from VAD speech-to-silence transition to turn completion.
(PR #3809) -
Added
on_audio_context_interrupted() and on_audio_context_completed() callbacks to AudioContextTTSService. Subclasses can override these to perform provider-specific cleanup instead of overriding _handle_interruption().
(PR #3814) -
Added
on_summary_applied event to LLMContextSummarizer for observability, providing message counts before and after context summarization.
(PR #3855) -
Added
summary_message_template to LLMContextSummarizationConfig for customizing how summaries are formatted when injected into context (e.g., wrapping in XML tags).
(PR #3855) -
Added
summarization_timeout to LLMContextSummarizationConfig (default 120s) to prevent hung LLM calls from permanently blocking future summarizations.
(PR #3855) -
Added optional
llm field to LLMContextSummarizationConfig for routing summarization to a dedicated LLM service (e.g., a cheaper/faster model) instead of the pipeline's primary model.
(PR #3855) -
Added AssemblyAI u3-rt-pro model support with built-in turn detection mode.
(PR #3856) -
Added
LLMSummarizeContextFrame to trigger on-demand context summarization from anywhere in the pipeline (e.g. a function call tool). Accepts an optional config: LLMContextSummaryConfig to override summary generation settings per request.
(PR #3863) -
Added
LLMContextSummaryConfig (summary generation params: target_context_tokens, min_messages_after_summary, summarization_prompt) and LLMAutoContextSummarizationConfig (auto-trigger thresholds: max_context_tokens, max_unsummarized_messages, plus a nested summary_config). These replace the monolithic LLMContextSummarizationConfig.
(PR #3863) -
Added support for the
speed_alpha parameter to the arcana model in RimeTTSService.
(PR #3873) -
Added
ClientConnectedFrame, a new SystemFrame pushed by all transports (Daily, LiveKit, FastAPI WebSocket, WebSocket Server, SmallWebRTC, HeyGen, Tavus) when a client connects. Enables observers to track transport readiness timing.
(PR #3881) -
Added
StartupTimingObserver for measuring how long each processor's start() method takes during pipeline startup. Also measures transport readiness (the time from StartFrame to first client connection) via the on_transport_timing_report event.
(PR #3881) -
Added
BotConnectedFrame for SFU transports and on_transport_timing_report event to StartupTimingObserver with bot and client connection timing.
(PR #3881) -
Added optional
direction parameter to PipelineTask.queue_frame() and PipelineTask.queue_frames(), allowing frames to be pushed upstream from the end of the pipeline.
(PR #3883) -
Added
on_latency_breakdown event to UserBotLatencyObserver providing per-service TTFB, text aggregation, user turn duration, and function call latency metrics for each user-to-bot response cycle.
(PR #3885) -
Added
on_first_bot_speech_latency event to UserBotLatencyObserver measuring the time from client connection to first bot speech. An on_latency_breakdown is also emitted for this first speech event.
(PR #3885) -
Added
broadcast_interruption() to FrameProcessor. This method pushes an InterruptionFrame both upstream and downstream directly from the calling processor, avoiding the round-trip through the pipeline task that push_interruption_task_frame_and_wait() required.
(PR #3896)
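The summarization_timeout entry above describes a guard against hung LLM calls. A minimal sketch of that idea, assuming a plain asyncio timeout: summarize_with_timeout and the two helper coroutines are hypothetical names for illustration, not Pipecat APIs, and the changelog does not state how the library implements the timeout.

```python
import asyncio


async def summarize_with_timeout(summarize, timeout_s=120.0):
    """Run a summarization coroutine, giving up after timeout_s seconds.

    `summarize` is a hypothetical zero-argument coroutine function standing
    in for the LLM call; it is not a Pipecat API.
    """
    try:
        return await asyncio.wait_for(summarize(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the hung call, so a later summarization can start.
        return None


async def _hung_call():
    await asyncio.sleep(3600)  # simulates an LLM request that never returns
    return "unreachable"


async def _quick_call():
    return "short summary"


hung_result = asyncio.run(summarize_with_timeout(_hung_call, timeout_s=0.05))
quick_result = asyncio.run(summarize_with_timeout(_quick_call, timeout_s=0.05))
```

Returning None on timeout is one possible design choice; a caller could just as well log the failure or raise, as long as the pending call is cancelled so it does not block the next attempt.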
Changed
-
Added
text_aggregation_mode parameter to TTSService and all TTS subclasses with a new TextAggregationMode enum (SENTENCE, TOKEN). All text now flows through text aggregators regardless of mode, enabling pattern detection and tag handling in TOKEN mode.
(PR #3696) -
⚠️ Refactored runtime-updatable service settings to use strongly-typed classes (
TTSSettings, STTSettings, LLMSettings, and service-specific subclasses) instead of plain dicts. Each service's _settings now holds these strongly-typed objects. For service maintainers, see changes in COMMUNITY_INTEGRATIONS.md.
(PR #3714) -
Word timestamp support has been moved from
WordTTSService into TTSService via a new supports_word_timestamps parameter. Services that previously extended WordTTSService, AudioContextWordTTSService, or WebsocketWordTTSService now pass supports_word_timestamps=True to their parent __init__ instead.
(PR #3786) -
Improved Ultravox TTFB measurement accuracy by using VAD speech end time instead of
UserStoppedSpeakingFrame timing.
(PR #3806) -
Aligned
UltravoxRealtimeLLMService frame handling with OpenAI/Gemini realtime services: added InterruptionFrame handling with metrics cleanup, processing metrics at response boundaries, and improved agent transcript handling for both voice and text output modalities.
(PR #3806) -
Updated
OpenAIRealtimeLLMService default model to gpt-realtime-1.5.
(PR #3807) -
Added
api_key parameter to KrispVivaSDKManager, KrispVivaTurn, and KrispVivaFilter for Krisp SDK v1.6.1+ licensing. Falls back to the KRISP_VIVA_API_KEY environment variable.
(PR #3809) -
Bumped
nltk minimum version from 3.9.1 to 3.9.3 to resolve a security vulnerability.
(PR #3811) -
ServiceSettingsUpdateFrames are now UninterruptibleFrames. Generally speaking, you don't want a user interruption to prevent a service setting change from going into effect. Note that you usually don't use ServiceSettingsUpdateFrame directly; you use one of its subclasses: LLMUpdateSettingsFrame, TTSUpdateSettingsFrame, or STTUpdateSettingsFrame.
(PR #3819)
-
Updated context summarization to use
user role instead of assistant for summary messages.
(PR #3855) -
Renamed
AssemblyAISTTService parameter min_end_of_turn_silence_when_confident to min_turn_silence (old name still supported with deprecation warning).
(PR #3856) -
⚠️ Renamed
LLMAssistantAggregatorParams fields: enable_context_summarization → enable_auto_context_summarization and context_summarization_config → auto_context_summarization_config (now accepts LLMAutoContextSummarizationConfig). The old names still work with a DeprecationWarning for one release cycle.
(PR #3863) -
ElevenLabsRealtimeSTTService now sets TranscriptionFrame.finalized to True when using CommitStrategy.MANUAL.
(PR #3865) -
Updated numba version pin from == to >=0.61.2
(PR #3868) -
Updated tracing code to use
ServiceSettings dataclass API (given_fields(), attribute access) instead of dict-style access (.items(), in, subscript).
(PR #3879) -
⚠️ Removed
event field and complete() method from InterruptionFrame. Removed event field from InterruptionTaskFrame. These are no longer needed since broadcast_interruption() does not require a round-trip completion signal.
(PR #3896) -
Moved
pipecat.services.deepgram.stt_sagemaker and pipecat.services.deepgram.tts_sagemaker to pipecat.services.deepgram.sagemaker.stt and pipecat.services.deepgram.sagemaker.tts. The old import paths still work but emit a DeprecationWarning.
(PR #3902)
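The move from dict-based settings to typed settings objects, and from dict-style access to given_fields(), can be pictured with a small stand-alone sketch. Everything here is an assumption for illustration: SketchSTTSettings, the NOT_GIVEN sentinel, and apply() are hypothetical, and given_fields() is modeled only on the method name mentioned above, not on Pipecat's actual implementation.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict

# Sentinel meaning "this field was not provided in the delta".
NOT_GIVEN = object()


@dataclass
class SketchSTTSettings:
    """Hypothetical stand-in for a typed settings class; not Pipecat's own."""

    model: Any = NOT_GIVEN
    language: Any = NOT_GIVEN

    def given_fields(self) -> Dict[str, Any]:
        # Return only the fields the caller explicitly set.
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not NOT_GIVEN
        }

    def apply(self, delta: "SketchSTTSettings") -> Dict[str, Any]:
        # Copy given fields from the delta and report what actually changed.
        changed = {}
        for name, value in delta.given_fields().items():
            if getattr(self, name) != value:
                setattr(self, name, value)
                changed[name] = value
        return changed


current = SketchSTTSettings(model="model-a", language="en")
changed = current.apply(SketchSTTSettings(language="es"))
```

With a typed delta, a misspelled field name fails at construction time instead of being silently ignored as an unknown dict key, which is one plausible motivation for this kind of refactor.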
Deprecated
-
⚠️ Deprecated
aggregate_sentences parameter on TTSService and all TTS subclasses. Use text_aggregation_mode=TextAggregationMode.SENTENCE or text_aggregation_mode=TextAggregationMode.TOKEN instead.
(PR #3696) -
Deprecated
set_model(), set_voice(), and set_language() on AI services in favor of runtime updates via TTSUpdateSettingsFrame, STTUpdateSettingsFrame, and LLMUpdateSettingsFrame. ⚠️ Note, too, a subtle behavior change in these deprecated methods. Whereas previously only
set_language() caused the service to actually react to the update (e.g. by reconnecting to a remote service so it can pick up the change), now all these methods do. This change was made as part of a refactor making them all work the same way under the hood.
(PR #3714) -
Dict-based
*UpdateSettingsFrame(settings={...}) is deprecated in favor of passing typed settings delta objects with *UpdateSettingsFrame(delta={...}).
(PR #3714) -
Deprecated
WordTTSService, WebsocketWordTTSService, AudioContextWordTTSService, and InterruptibleWordTTSService. Use their non-word counterparts with supports_word_timestamps=True instead:
WordTTSService → TTSService(supports_word_timestamps=True)
WebsocketWordTTSService → WebsocketTTSService(supports_word_timestamps=True)
AudioContextWordTTSService → AudioContextTTSService(supports_word_timestamps=True)
InterruptibleWordTTSService → InterruptibleTTSService(supports_word_timestamps=True)
(PR #3786)
-
Deprecated
SmartTurnMetricsData in favor of TurnMetricsData. BaseSmartTurn now emits TurnMetricsData directly.
(PR #3809) -
Deprecated
LLMContextSummarizationConfig. Use LLMAutoContextSummarizationConfig with a nested LLMContextSummaryConfig instead. The old class emits a DeprecationWarning.
(PR #3863) -
Deprecated
push_interruption_task_frame_and_wait() in FrameProcessor. Use broadcast_interruption() instead. The old method now delegates to broadcast_interruption() and logs a deprecation warning.
(PR #3896)
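The last entry above describes a common deprecation pattern: the old method keeps working but warns and forwards to its replacement. A rough sketch of that pattern, assuming Python's warnings module as the warning channel: SketchProcessor and its recorded tuples are hypothetical stand-ins, and the real FrameProcessor may log the deprecation differently.

```python
import asyncio
import warnings


class SketchProcessor:
    """Hypothetical stand-in for a frame processor; not Pipecat's class."""

    def __init__(self):
        self.pushed = []

    async def broadcast_interruption(self):
        # Stand-in behavior: record one frame per direction.
        self.pushed.append(("InterruptionFrame", "upstream"))
        self.pushed.append(("InterruptionFrame", "downstream"))

    async def push_interruption_task_frame_and_wait(self):
        # Deprecated entry point: warn, then delegate to the new method.
        warnings.warn(
            "push_interruption_task_frame_and_wait() is deprecated; "
            "use broadcast_interruption() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        await self.broadcast_interruption()


processor = SketchProcessor()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    asyncio.run(processor.push_interruption_task_frame_and_wait())
```

Callers of the old name get the new behavior immediately, so migrating is a rename rather than a behavior change.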
Removed
-
Removed
local-smart-turn-v3 optional extra from pyproject.toml. The transformers and onnxruntime packages are now always installed as core dependencies since they are required by the default turn stop strategy, TurnAnalyzerUserTurnStopStrategy, which uses LocalSmartTurnAnalyzerV3.
(PR #3803) -
⚠️ Removed
PlayHTTTSService and PlayHTHttpTTSService. PlayHT has been shut down and is no longer available.
(PR #3838)
Fixed
-
Added
LLMSpecificMessage handling in LLMContextSummarizationUtil to skip provider-specific messages during context summarization.
(PR #3794) -
Treated
response_cancel_not_active as a non-fatal error in realtime services (OpenAIRealtimeLLMService, GrokRealtimeLLMService, OpenAIRealtimeBetaLLMService) to prevent WebSocket disconnection when cancelling an inactive response.
(PR #3795) -
Fixed Poetry compatibility by inlining
local-smart-turn-v3 dependencies (transformers, onnxruntime) into core dependencies instead of using a self-referential extra.
(PR #3803) -
Fixed
SentryMetrics method signatures to match updated FrameProcessorMetrics base class, resolving TypeError when using start_time/end_time keyword arguments.
(PR #3808) -
Fixed STT TTFB metrics not being reported for
SonioxSTTService and AWSTranscribeSTTService due to missing can_generate_metrics() override.
(PR #3813) -
Fixed an issue where
AudioContextTTSService-based providers (AsyncAI, ElevenLabs, Inworld, Rime) did not close or clean up their server-side audio contexts after normal speech completion, only on interruption.
(PR #3814) -
Fixed STT TTFB metrics measuring timeout expiry time instead of actual transcript arrival time.
(PR #3822) -
Fixed
InterimTranscriptionFrame and TranslationFrame being unintentionally pushed downstream in LLMUserAggregator. They are now consumed like TranscriptionFrame.
(PR #3825) -
Fixed misleading "Empty audio frame received for STT service" warnings when using audio filters (e.g.
RNNoiseFilter, KrispVivaFilter, AICFilter) that buffer audio internally.
(PR #3828) -
Fixed issues with
RimeNonJsonTTSService where trailing punctuation is sometimes vocalized.
(PR #3837) -
Fixed
TTSSpeakFrame not committing spoken text to the conversation context when used outside of an LLM response (e.g., bot greetings or injected speech).
(PR #3845) -
Removed verbose per-chunk audio logging from
GenesysAudioHookSerializer that flooded production logs.
(PR #3850) -
Added a beta feature warning when using custom prompts with AssemblyAI.
(PR #3856) -
Fixed
LocalSmartTurnAnalyzerV3 producing incorrect end-of-turn predictions at non-16kHz sample rates (e.g. 8kHz Twilio telephony) by adding automatic resampling to 16kHz before Whisper feature extraction.
(PR #3857) -
Fixed
PipelineTask double-inserting RTVIProcessor into the frame chain when the user provides both an RTVIProcessor in the pipeline and a custom RTVIObserver subclass in observers.
(PR #3867) -
Fixed turn completion instructions being lost when
LLMMessagesUpdateFrame replaces the LLM context. When filter_incomplete_user_turns is enabled, the turn completion system message is now re-injected after context replacement.
(PR #3888) -
Fixed Azure TTS and STT services silently swallowing cancellation errors (invalid API key, network failures, rate limiting) instead of propagating them as
ErrorFrames to the pipeline.
(PR #3893)
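The LocalSmartTurnAnalyzerV3 fix above relies on bringing audio to 16kHz before feature extraction. The sketch below illustrates sample-rate conversion in general, using plain linear interpolation as a swapped-in technique: the changelog does not say which resampler the fix uses, and resample_linear is a hypothetical helper, not a Pipecat function.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono signal with linear interpolation (illustrative only)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(out_len):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the final sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


TARGET_RATE = 16000  # rate named in the changelog entry

telephony_audio = [0.0, 1.0, 0.0, -1.0]  # four samples at 8 kHz
upsampled = resample_linear(telephony_audio, 8000, TARGET_RATE)
```

Going from 8kHz to 16kHz doubles the number of samples, so a model that assumes 16kHz input sees audio of the correct duration instead of a clip that appears to play at double speed.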
Performance
- Switched
GradiumTTSService from InterruptibleWordTTSService to AudioContextWordTTSService, eliminating websocket disconnect/reconnect on every interruption by using client_req_id-based multiplexing.
(PR #3759)
Other
- Standardized Sarvam STT/TTS User-Agent header handling to consistently send Pipecat SDK identity in websocket requests.
(PR #3886)