This release has been yanked due to resampling issues affecting audio output
quality and critical bugs impacting ParallelPipelines functionality.
Please upgrade to version 0.0.76 or later.
Added
- Added a new STT service, `SpeechmaticsSTTService`. This service provides real-time speech-to-text transcription using the Speechmatics API. It supports partial and final transcriptions, multiple languages, various audio formats, and speaker diarization.
- Added `normalize` and `model_id` to `FishAudioTTSService`.
- Added `http_options` argument to `GoogleLLMService`.
- Added `run_llm` field to `LLMMessagesAppendFrame` and `LLMMessagesUpdateFrame` frames. If true, a context frame will be pushed triggering the LLM to respond.
- Added a new `SOXRStreamAudioResampler` for processing audio in chunks or streams. If you write your own processor and need to use an audio resampler, use the new `create_stream_resampler()`.
- Added new `DailyParams.audio_in_user_tracks` to allow receiving one track per user (default) or a single track from the room (all participants mixed).
- Added support for providing "direct" functions, which don't need an accompanying `FunctionSchema` or function definition dict. Instead, metadata (i.e. `name`, `description`, `properties`, and `required`) are automatically extracted from a combination of the function signature and docstring.

  Usage:

      # "Direct" function
      # `params` must be the first parameter
      async def do_something(params: FunctionCallParams, foo: int, bar: str = ""):
          """
          Do something interesting.

          Args:
            foo (int): The foo to do something interesting with.
            bar (string): The bar to do something interesting with.
          """
          result = await process(foo, bar)
          await params.result_callback({"result": result})

      # ...

      llm.register_direct_function(do_something)

      # ...

      tools = ToolsSchema(standard_tools=[do_something])
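To make the "direct" function idea more concrete, here is a minimal, self-contained sketch of how `name`, `description`, `properties`, and `required` could be derived from a signature and a docstring using only the standard library. It is not the library's implementation: `extract_metadata`, the `_JSON_TYPES` mapping, and the docstring parsing rules are hypothetical and exist only for illustration.

```python
import inspect
import re
from typing import Any, Callable, Dict

# Hypothetical mapping from Python annotations to JSON-schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def extract_metadata(func: Callable[..., Any]) -> Dict[str, Any]:
    """Derive name, description, properties, and required from a function."""
    doc = inspect.getdoc(func) or ""
    # Assumption: everything before the "Args:" section is the description.
    description = doc.split("Args:")[0].strip()
    # Lines such as "foo (int): some text" provide per-parameter descriptions.
    arg_docs = dict(
        re.findall(r"^[ \t]*(\w+)[ \t]*\([^)]*\):[ \t]*(.+)$", doc, flags=re.MULTILINE)
    )

    # Skip the first parameter (`params`), which is not an LLM-supplied argument.
    parameters = list(inspect.signature(func).parameters.values())[1:]
    properties: Dict[str, Any] = {}
    required = []
    for p in parameters:
        prop = {"type": _JSON_TYPES.get(p.annotation, "string")}
        if p.name in arg_docs:
            prop["description"] = arg_docs[p.name].strip()
        properties[p.name] = prop
        # A parameter without a default value is treated as required.
        if p.default is inspect.Parameter.empty:
            required.append(p.name)

    return {
        "name": func.__name__,
        "description": description,
        "properties": properties,
        "required": required,
    }


async def do_something(params: object, foo: int, bar: str = ""):
    """
    Do something interesting.

    Args:
        foo (int): The foo to do something interesting with.
        bar (string): The bar to do something interesting with.
    """


metadata = extract_metadata(do_something)
```

In this sketch, `foo` ends up in `required` because it has no default, while `bar` is optional. The real extraction rules may differ, for example in how type hints and docstring styles are handled.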
- `user_id` is now populated in the `TranscriptionFrame` and `InterimTranscriptionFrame` when using a transport that provides a `user_id`, like `DailyTransport` or `LiveKitTransport`.
- Added `watchdog_coroutine()`. This is a watchdog helper for coroutines. If you have a coroutine that waits a long time for a result, wrap it with `watchdog_coroutine()` so the watchdog timers are reset regularly.
- Added `session_token` parameter to `AWSNovaSonicLLMService`.
- Added Gemini Multimodal Live File API for uploading, fetching, listing, and deleting files. See `26f-gemini-multimodal-live-files-api.py` for example usage.
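The watchdog helper mentioned above follows a general pattern: await a slow coroutine while periodically signalling that the task is still alive. The sketch below illustrates that pattern with plain `asyncio`. It does not use or reproduce `watchdog_coroutine()` itself, and `run_with_watchdog_resets`, `reset_watchdog`, and `interval` are hypothetical names chosen for this example.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, TypeVar

T = TypeVar("T")


async def run_with_watchdog_resets(
    coro: Awaitable[T],
    reset_watchdog: Callable[[], None],
    interval: float = 1.0,
) -> T:
    """Await `coro`, calling `reset_watchdog()` every `interval` seconds until it finishes."""
    task = asyncio.ensure_future(coro)
    try:
        while True:
            # Wait for completion, but wake up at least every `interval` seconds.
            # asyncio.wait() does not cancel the task when the timeout expires.
            done, _ = await asyncio.wait({task}, timeout=interval)
            if done:
                return task.result()
            # Still running: tell the (hypothetical) watchdog we are alive.
            reset_watchdog()
    finally:
        # If the caller is cancelled, do not leave the inner task running.
        if not task.done():
            task.cancel()


async def _demo() -> Tuple[str, int]:
    resets = []

    async def slow() -> str:
        await asyncio.sleep(0.25)
        return "done"

    value = await run_with_watchdog_resets(slow(), lambda: resets.append(1), interval=0.05)
    return value, len(resets)


result, reset_count = asyncio.run(_demo())
```

The key design point is that the slow operation keeps running untouched while the wrapper wakes up on a timer, so a watchdog that expects regular activity is not tripped by a legitimately long wait.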
Changed
- Updated all the services to use the new `SOXRStreamAudioResampler`, ensuring smooth transitions and eliminating clicks.
- Upgraded `daily-python` to 0.19.4.
- Updated `google` optional dependency to use `google-genai` version `1.24.0`.
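One plausible reason a streaming resampler gives smoother transitions is that it carries state across chunk boundaries instead of restarting on every chunk. The sketch below illustrates this with simple linear interpolation, which is a stand-in technique and not the SoXR algorithm. `LinearStreamResampler` is a hypothetical class written for this example and is unrelated to the `SOXRStreamAudioResampler` API.

```python
import math
from typing import List


class LinearStreamResampler:
    """Toy streaming resampler (linear interpolation) that keeps state between chunks."""

    def __init__(self, in_rate: int, out_rate: int) -> None:
        self.step = in_rate / out_rate  # input samples advanced per output sample
        self.pos = 0.0                  # fractional read position relative to buf[0]
        self.buf: List[float] = []      # input samples carried over between calls

    def process(self, chunk: List[float]) -> List[float]:
        self.buf.extend(chunk)
        out: List[float] = []
        # Interpolating at `pos` needs samples int(pos) and int(pos) + 1.
        while int(self.pos) + 1 < len(self.buf):
            i = int(self.pos)
            frac = self.pos - i
            out.append(self.buf[i] * (1.0 - frac) + self.buf[i + 1] * frac)
            self.pos += self.step
        # Drop samples that are no longer needed, keeping the read position consistent.
        drop = min(int(self.pos), len(self.buf))
        self.buf = self.buf[drop:]
        self.pos -= drop
        return out


# A short 440 Hz test tone sampled at 24 kHz, resampled to 16 kHz.
signal = [math.sin(2 * math.pi * 440 * n / 24000) for n in range(70)]
chunks = [signal[i:i + 7] for i in range(0, len(signal), 7)]

# Reference: resample the whole signal in one call.
one_shot = LinearStreamResampler(24000, 16000).process(signal)

# Streaming: one resampler instance reused across all chunks.
stream = LinearStreamResampler(24000, 16000)
streamed = [s for chunk in chunks for s in stream.process(chunk)]

# Stateless: a fresh resampler per chunk, which loses samples at every boundary.
stateless = [s for chunk in chunks for s in LinearStreamResampler(24000, 16000).process(chunk)]
```

With these rates the stateful stream reproduces the one-shot result, while the stateless variant produces a different, shorter output because each chunk is resampled without knowledge of its neighbours. Discontinuities of that kind at chunk boundaries are the sort of artifact that can be heard as clicks.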
Fixed
- Fixed an issue where audio would get stuck in the queue when an interrupt occurs during Azure TTS synthesis.
- Fixed a race condition that occurs in Python 3.10+ where the task could miss the `CancelledError` and continue running indefinitely, freezing the pipeline.
- Fixed an `AWSNovaSonicLLMService` issue introduced in 0.0.72.
Deprecated
- In `FishAudioTTSService`, deprecated `model` and replaced with `reference_id`. This change is to better align with Fish Audio's variable naming and to reduce confusion about what functionality the variable controls.