Added
-
Added
RTVIProcessor
which implements the RTVI-AI standard.
See https://github.com/rtvi-ai -
Added
BotInterruptionFrame
which allows interrupting the bot while talking. -
Added
LLMMessagesAppendFrame
which allows appending messages to the current LLM context. -
Added
LLMMessagesUpdateFrame
which allows changing the LLM context for the one provided in this new frame. -
Added
LLMModelUpdateFrame
which allows updating the LLM model. -
Added
TTSSpeakFrame
which causes the bot say some text. This text will not be part of the LLM context. -
Added
TTSVoiceUpdateFrame
which allows updating the TTS voice.
Removed
- We remove the
LLMResponseStartFrame
andLLMResponseEndFrame
frames. These were added in the past to properly handle interruptions for theLLMAssistantContextAggregator
. But theLLMContextAggregator
is now based onLLMResponseAggregator
which handles interruptions properly by just processing theStartInterruptionFrame
, so there's no need for these extra frames any more.
Fixed
-
Fixed an issue with
StatelessTextTransformer
where it was pushing a string instead of aTextFrame
. -
TTSService
end of sentence detection has been improved. It now works with acronyms, numbers, hours and others. -
Fixed an issue in
TTSService
that would not properly flush the current aggregated sentence if anLLMFullResponseEndFrame
was found.
Performance
CartesiaTTSService
now uses websockets which improves speed. It also leverages the new Cartesia contexts which maintains generated audio prosody when multiple inputs are sent, therefore improving audio quality a lot.