livekit/agents livekit-agents@1.3.3 on GitHub

New Features

Observability

To learn more about the new observability features, check out our full write-up on the LiveKit blog. It walks through how session playback, trace inspection, and synchronized logs streamline debugging for voice agents. Read more here

New CLI

The CLI has been redesigned, and a new text-only mode was added so you can test your agent without using voice.

python3 my_agent.py console --text

You can also now configure both the input device and output device directly through the provided parameters.

python3 my_agent.py console --input-device "AirPods" --output-device "MacBook"

New AgentServer API

We’ve renamed Worker to AgentServer, and you now need to use a decorator to define the entrypoint. All existing functionality remains backward compatible. This change lays the groundwork for upcoming design improvements and new features.

server = AgentServer()

def prewarm(proc: JobProcess): ...
def load(proc: JobProcess): ...

server.setup_fnc = prewarm
server.load_fnc = load

@server.rtc_session(agent_name="my_customer_service_agent")
async def entrypoint(ctx: JobContext): ...

Session Report & on_session_end callback

Use the on_session_end callback to generate a structured SessionReport that the conversation history, events, recording metadata, and the agent’s configuration.

server = AgentServer()

async def on_session_end(ctx: JobContext) -> None:
    report = ctx.make_session_report()
    print(json.dumps(report.to_dict(), indent=2))
    
    chat_history = report.chat_history
    # Do post-processing on your session (e.g final evaluations, generate a summary, ...)

@server.rtc_session(on_session_end=on_session_end)
async def my_agent(ctx: JobContext) -> None:
    ...

AgentHandoff item

To capture everything that occurred during your session, we added an AgentHandoff item to the ChatContext.

class AgentHandoff(BaseModel):
    ...
    old_agent_id: str | None
    new_agent_id: str

Improved turn detection model

We updated the turn-detection model, resulting in measurable accuracy improvements across most languages. The table below shows the change in tnr@0.993 between versions 0.4.0 and 0.4.1, along with the percentage difference.

This new version also handles special user inputs such as email addresses, street addresses, and phone numbers much more effectively.

514623611-bb709e00-71ca-4b0e-86c4-fd854dcaf51c

TaskGroup

We added TaskGroup, which lets you run multiple tasks concurrently and wait for all of them to finish. This is useful when collecting several pieces of information from a user where the order doesn’t matter, or when the user may revise earlier inputs while continuing the flow.

We’ve also added an example that uses TaskGroup to build a SurveyAgent, which you can use as a reference.

task_group = TaskGroup()
task_group.add(lambda: GetEmailTask(), id="get_email_task", description="Get the email address")
task_group.add(lambda: GetPhoneNumberTask(), id="phone_number_task", description="Get the phone number")
task_group.add(lambda: GetCreditCardTask(), id="credit_card_task", description="Get credit card")
results = await task_group

IVR systems

Agents can now optionally handle IVR-style interactions. Enabling ivr_detection allows the session to identify and respond appropriately to IVR tones or patterns, and min_endpointing_delay lets you control how long the system waits before ending a turn—useful for menu-style inputs.

session = AgentSession(
    ivr_detection=True,
    min_endpointing_delay=5,
)

llm_node FlushSentinel

We added a FlushSentinel marker that can be yielded from llm_node to flush partial LLM output to TTS and start a new TTS stream. This lets you emit a short, early response (for example, when a specific tool call is detected) while the main LLM response continues in the background. For a concrete pattern, see the flush_llm_node.py example.

async def llm_node(self, chat_ctx: llm.ChatContext, tools: list[llm.FunctionTool], model_settings: ModelSettings) -> AsyncIterable[llm.ChatChunk | FlushSentinel]:
    yield "This is the first sentence"
    yield FlushSentinel()
    yield "Another TTS generation"

What's Changed

feat: new CLI & new AgentServer API by @theomonnom in #3199
remove unused code & fix ServerEnvOption by @theomonnom in #3220
remove custom excepthook by @theomonnom in #3221
fix python 3.9 by @theomonnom in #3222
fix invalid LogLevel on the CLI by @theomonnom in #3292
add Agent.id by @theomonnom in #3478
add AgentHandoff chat item by @theomonnom in #3479
Add AgentHandoff to the chat_ctx & AgentSessionReport by @theomonnom in #3541
fix cli readchar by @theomonnom in #3542
fix RecorderIO av.error.MemoryError by @theomonnom in #3543
fix record & save to tempfile by @theomonnom in #3544
save session json report when --record is enabled by @theomonnom in #3572
brianyin/agt-1947-automatically-parse-dtmf-input-from-users by @toubatbrian in #3512
ingest data to cloud by @theomonnom in #3609
fix Audio/Video input source attach by @theomonnom in #3615
Allow Recording Verbal DTMF Input when ask_confirmation is turned off by @toubatbrian in #3607
Agent IVR System Example by @toubatbrian in #3610
add ChatContext.summarize by @theomonnom in #3660
Gather DTMF Minor Bug Fix by @toubatbrian in #3672
brianyin/agt-2076-support-repeat-instruction-in-dtmf-gathering by @toubatbrian in #3674
rename assistant to agent by @theomonnom in #3690
TaskGroup by @tinalenguyen in #3680
ignore on_enter on GetEmailTask by @theomonnom in #3691
Refactor mock session utilities into a separate file by @toubatbrian in #3692
fix _MetadataLogProcessor by @tinalenguyen in #3697
add Created-At header for the audio recording by @theomonnom in #3698
fix tool validation by @tinalenguyen in #3699
use otel logger for the chat_history by @theomonnom in #3700
Support Agent Session Tools by @toubatbrian in #3707
add extra instructions + tools params into GetEmailTask by @tinalenguyen in #3711
format transcript logs by @paulwe in #3708
add participant attributes to traces by @theomonnom in #3725
fix duplicate agent_session span by @theomonnom in #3726
fix chat_history upload by @theomonnom in #3728
rename realtime_session to rtc_session by @theomonnom in #3729
add backward compatibility by @theomonnom in #3730
add missing options attr to session start log by @paulwe in #3731
brian/dtmf-send-tool by @toubatbrian in #3656
log potential thread leaks preventing process from exiting by @theomonnom in #3744
check room connection state + rename to on_emit + taskgroup fix by @tinalenguyen in #3738
add survey agent example by @tinalenguyen in #3681
update examples to use AgentServer by @tinalenguyen in #3767
allow multiple ids for out of scope by @tinalenguyen in #3789
add chat_history json to the report upload by @theomonnom in #3799
set log timestamps for chat history by @paulwe in #3800
check recorder_io in make_session_report by @tinalenguyen in #3805
feat(cartesia): add LiveKit user agent to requests by @mi-yu in #3809
Add Speechmatics TTS by @aaronng91 in #3754
built-in GetAddressTask by @tinalenguyen in #3807
fix extra instructions param and update confirm_address docstring by @tinalenguyen in #3810
Add support for using a previous silero vad model file by @zaheerabbas-prodigal in #3779
allow updating the same agent that is running to apply changes in agent by @longcw in #3814
chore: fix ruff & formatting by @davidzhao in #3827
fix type checking for agents 1.3 by @longcw in #3842
fix: correct base64 data handling in image content conversion #3867 by @tarsyang in #3868
fix observability by @davidzhao in #3828
avoid rotating transcription synchronizer twice during detach and attach by @longcw in #3845
fix pickling AgentServer for python 3.9 by @longcw in #3847
add better word alignment for Cartesia by @chenghao-mou in #3876
fix jupyter for agents 1.3 by @longcw in #3877
feat(minimax): comprehensive TTS updates and parameter rename by @zhenyujia23-crypto in #3788
feat(aws): add credentials customization for aws stt by @civilcoder55 in #3840
make sure user away timer is cancelled when session closed by @longcw in #3895
fix duplicate responses from gemini by @tinalenguyen in #3898
support google safety settings by @tinalenguyen in #3815
add audio_frame_size_ms for RoomInputOptions by @longcw in #3899
add new room options by @longcw in #3417
feat(tts): add sample rate option to TTS configuration for rime tts plugin arcana model by @gokuljs in #3910
deepgram plugin: better websocket logs by @jjmaldonis in #3912
Add download location in readme by @chenghao-mou in #3908
chore: move LK env var checks later by @davidzhao in #3920
fix ForkServerContext import by @theomonnom in #3924
add timeout to datastream clear_buffer to avoid deadlock when missing playback finished event by @longcw in #3917
observability cleanup by @davidzhao in #3929
feature: GPT-5.1 support by @c0mpli in #3928
record when the session was started by @davidzhao in #3930
add <3.14 requirement temporarily by @chenghao-mou in #3921
Allow tool role for dummy user message by @chenghao-mou in #3938
add <3.14 requirement temporarily by @chenghao-mou in #3942
skip CI checks for md changes by @chenghao-mou in #3939
AGT-2200 Improve usage collector and metric logging with more details by @chenghao-mou in #3935
feat(cartesia): debug log Cartesia request id on WS connection by @mi-yu in #3940
Allow users to pick BVCTelephony at runtime by @bcherry in #3926
don't use decorators for setup_fnc & load_fnc by @theomonnom in #3945
expose room_io from the AgentSession by @theomonnom in #3946
Added Support for gpt-5.1-chat-latest by @devb-enp in #3932
turn-detection: use v0.4.1-intl by @lwestn in #3941
optimizations for turn detector model size by @davidzhao in #3953
feat: add AvatarTalk integration by @Maelstro in #3139
remove noisy error/warn logs by @theomonnom in #3955
release livekit-agents 1.3.1 by @theomonnom in #3957
better setup_fnc & load_fnc API & fix examples by @theomonnom in #3958
fix types for agents 1.3.1 by @longcw in #3959
livekit-agents 1.3.2 by @theomonnom in #3960
Skip FallbackLLMStream nested duplicate traces by @chenghao-mou in #3934
add flush for llm_node by @longcw in #3933
expose worker_load to prom & /worker endpoint by @theomonnom in #3968
Fix GPT-5.1 reasoning_effort by @ivanpuhachov in #3966
include speech duration in VAD EOS by @jayeshp19 in #3951
revert #2787: execute tools without waiting text generation by @longcw in #3962
livekit-agents 1.3.3 by @theomonnom in #4024

New Contributors

@mi-yu made their first contribution in #3809
@aaronng91 made their first contribution in #3754
@zaheerabbas-prodigal made their first contribution in #3779
@tarsyang made their first contribution in #3868
@chenghao-mou made their first contribution in #3876
@zhenyujia23-crypto made their first contribution in #3788
@civilcoder55 made their first contribution in #3840
@jjmaldonis made their first contribution in #3912
@c0mpli made their first contribution in #3928
@devb-enp made their first contribution in #3932
@Maelstro made their first contribution in #3139
@ivanpuhachov made their first contribution in #3966

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.2.18...livekit-agents@1.3.3