livekit/agents livekit-agents@1.2.2 on GitHub

Note

livekit-agents 1.2 introduced many new features. You can check out the changelog here.

New features

SpeechHandle

Waiting for playout to finish inside a function tool could lead to deadlocks. In this version, an error is raised instead. To wait for the assistant's spoken response before executing a tool, use RunContext.wait_for_playout.

@function_tool
async def my_function_tool(self, ctx: RunContext):
    await ctx.wait_for_playout() # wait for the assistant's spoken response that started the execution of this tool
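
To illustrate why raising an error is safer than waiting, here is a toy asyncio model of the guard. It is only a sketch and not the LiveKit implementation: ToySpeech, bad_tool, and the assumption that a speech counts as finished only after its tools return are all hypothetical.

```python
import asyncio


class ToySpeech:
    """Hypothetical stand-in for a speech handle; not the LiveKit class."""

    def __init__(self):
        self._done = asyncio.Event()
        self._tool_tasks = set()

    async def wait_for_playout(self):
        # Guard: a tool that belongs to this speech must not wait for it,
        # because (in this toy model) the speech only finishes after its
        # tools have returned. Waiting here would never complete.
        if asyncio.current_task() in self._tool_tasks:
            raise RuntimeError("cannot wait for playout from inside its own tool")
        await self._done.wait()

    async def run(self, tool):
        # Register the tool task before it gets a chance to run.
        task = asyncio.create_task(tool(self))
        self._tool_tasks.add(task)
        try:
            return await task
        finally:
            # Playout is marked finished only once the tool is done.
            self._done.set()


async def bad_tool(speech):
    # Without the guard, this await would hang forever.
    await speech.wait_for_playout()
    return "done"


async def main():
    speech = ToySpeech()
    try:
        return await speech.run(bad_tool)
    except RuntimeError as exc:
        return f"error: {exc}"


result = asyncio.run(main())
print(result)
```

In this model the tool fails fast with a RuntimeError instead of hanging, which mirrors the behavior described above.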

False interruption detection

We now emit an event when the agent is interrupted but no transcript is received (likely a false interruption).
This is useful for regenerating an assistant reply so the agent doesn't seem stuck.

@session.on("agent_false_interruption")
def on_false_interruption(ev: AgentFalseInterruptionEvent):
    session.generate_reply(instructions=ev.extra_instructions or NOT_GIVEN)

Initial conversation recording

We have begun implementing conversation recording directly within the Worker. Currently, it can be accessed using the console subcommand. A future update will provide an API to use this in production.

python3 examples/drive-thru/drivethru_agent.py console --record

What's Changed

fix cartesia non-streaming tts by @longcw in #2942
add RecorderIO and --record flag to the console mode by @theomonnom in #2934
chore: remove prometheus database from repository by @mateuszkulpa in #2944
parameterize inference worker init timeout by @levity in #2805
plugins: openai: llm: add support for service_tier by @mike-r-mclaughlin in #2945
fix: upgrade bithuman library to unblock accessing agents by @CathyL0 in #2948
fix duplicated user messages when preemptive generation canceled by @longcw in #2949
fix azure stt update options and add logs for error reason by @longcw in #2954
Explicitly calling ctx.connect before wait_for_participant by @sascotto in #2957
azure stt: disable language detection if only one language specified by @longcw in #2959
gemini: emit input_speech_started when new generation created by @longcw in #2963
evals: fix realtime model RuntimeError by @theomonnom in #2965
reveri/fix-11labs-error-fstring by @johncDepop in #2964
add RunContext.wait_for_playout and guard against deadlocks by @theomonnom in #2966
feat(realtime_model): correctly emit errors when the response is done by @bml1g12 in #2967
slightly optimize import time by @theomonnom in #2968
increase RoomInput frame_size_ms to 50ms by @theomonnom in #2970
add warning when enabling unprovided input/output sinks by @longcw in #2969
Handle RN format for preconnect mimeType by @davidzhao in #2952
tune vad min_silence_duration and min_endpointing_delay by @longcw in #2953
feat: add anam avatar by @karlson-anam in #2938
fix types for anam avatar plugin by @longcw in #2976
fix 11labs tts when audio is an empty string by @longcw in #2973
support resume agent from a false interruption by @longcw in #2852
feat: add simli avatar with example by @Antonyesk601 in #2923
add simli plugin to ci by @longcw in #2978
remove Resemble from CI by @theomonnom in #2979
clean up avatar example and add retry for datastream io rpc call by @longcw in #2943
expose transcription sync speed to RoomOutputOptions by @longcw in #2984
hume tts: raise error message from the api by @longcw in #2982
io: add input source hierarchy & cleanup by @theomonnom in #2983
fix AgentFalseInterruptedEvent none message by @theomonnom in #2987
rename AgentFalseInterruptedEvent -> AgentFalseInterruptionEvent by @theomonnom in #2988
nit: update AgentFalseInterruptionEvent by @theomonnom in #2989
fix deadlock & session close race by @theomonnom in #2997
wait on_exit before pause scheduling by @longcw in #2996
Gladia STT - add region parameter to gladia stt by @mfernandez-gladia in #2995
fix: upgrade bithuman library version by @CathyL0 in #2998
improve GetEmailTask instructions by @theomonnom in #3002
message should be None when empty by @theomonnom in #3003
ci: enable verbose evals by @theomonnom in #3004
fix sensitive TTS tests by @theomonnom in #3005

New Contributors

@levity made their first contribution in #2805
@CathyL0 made their first contribution in #2948
@ladvoc made their first contribution in #2956
@johncDepop made their first contribution in #2964
@bml1g12 made their first contribution in #2967
@karlson-anam made their first contribution in #2938
@Antonyesk601 made their first contribution in #2923

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.2.0...livekit-agents@1.2.2