New Features
Observability
To learn more about the new observability features, check out our full write-up on the LiveKit blog. It walks through how session playback, trace inspection, and synchronized logs streamline debugging for voice agents.
New CLI
The CLI has been redesigned, and a new text-only mode was added so you can test your agent without using voice.
python3 my_agent.py console --text
You can also now configure both the input device and output device directly through the provided parameters.
python3 my_agent.py console --input-device "AirPods" --output-device "MacBook"
New AgentServer API
We’ve renamed Worker to AgentServer, and you now need to use a decorator to define the entrypoint. All existing functionality remains backward compatible. This change lays the groundwork for upcoming design improvements and new features.
server = AgentServer()
def prewarm(proc: JobProcess): ...
def load(proc: JobProcess): ...
server.setup_fnc = prewarm
server.load_fnc = load
@server.rtc_session(agent_name="my_customer_service_agent")
async def entrypoint(ctx: JobContext): ...
Session Report & on_session_end callback
Use the on_session_end callback to generate a structured SessionReport that contains the conversation history, events, recording metadata, and the agent’s configuration.
import json

server = AgentServer()

async def on_session_end(ctx: JobContext) -> None:
    report = ctx.make_session_report()
    print(json.dumps(report.to_dict(), indent=2))
    chat_history = report.chat_history
    # Do post-processing on your session (e.g. final evaluations, generating a summary, ...)

@server.rtc_session(on_session_end=on_session_end)
async def my_agent(ctx: JobContext) -> None:
    ...
AgentHandoff item
To capture everything that occurred during your session, we added an AgentHandoff item to the ChatContext.
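As a rough illustration of what this enables, the sketch below walks a chat history and recovers the order in which agents took over. It is self-contained and does not import livekit-agents: the Message and Handoff classes are hypothetical stand-ins, and only the old_agent_id and new_agent_id field names come from the AgentHandoff definition in this section.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Message:
    """Stand-in for a regular chat item (hypothetical, for this sketch only)."""
    role: str
    text: str


@dataclass
class Handoff:
    """Stand-in mirroring the two AgentHandoff fields named in this section."""
    old_agent_id: Optional[str]
    new_agent_id: str


def agent_sequence(items: List[Union[Message, Handoff]]) -> List[str]:
    """Return the agent ids in the order they took over the session."""
    return [item.new_agent_id for item in items if isinstance(item, Handoff)]


# Example history: the first handoff has no previous agent.
history = [
    Handoff(old_agent_id=None, new_agent_id="greeter"),
    Message(role="user", text="I need help with a refund."),
    Handoff(old_agent_id="greeter", new_agent_id="billing"),
    Message(role="assistant", text="Sure, I can help with that."),
]
print(agent_sequence(history))
```

With the real ChatContext, the same filtering idea would presumably apply to its items, but the exact attribute names should be checked against the library.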
class AgentHandoff(BaseModel):
    ...
    old_agent_id: str | None
    new_agent_id: str
Improved turn detection model
We updated the turn-detection model, resulting in measurable accuracy improvements across most languages. The table below shows the change in tnr@0.993 between versions 0.4.0 and 0.4.1, along with the percentage difference.
This new version also handles special user inputs such as email addresses, street addresses, and phone numbers much more effectively.
TaskGroup
We added TaskGroup, which lets you run multiple tasks concurrently and wait for all of them to finish. This is useful when collecting several pieces of information from a user where the order doesn’t matter, or when the user may revise earlier inputs while continuing the flow.
We’ve also added an example that uses TaskGroup to build a SurveyAgent, which you can use as a reference.
task_group = TaskGroup()
task_group.add(lambda: GetEmailTask(), id="get_email_task", description="Get the email address")
task_group.add(lambda: GetPhoneNumberTask(), id="phone_number_task", description="Get the phone number")
task_group.add(lambda: GetCreditCardTask(), id="credit_card_task", description="Get credit card")
results = await task_group
IVR systems
Agents can now optionally handle IVR-style interactions. Enabling ivr_detection allows the session to identify and respond appropriately to IVR tones or patterns, and min_endpointing_delay lets you control how long the system waits before ending a turn, which is useful for menu-style inputs.
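To make the effect of a longer delay concrete, here is a toy model of endpointing. It is only a sketch under simplifying assumptions, not how livekit-agents decides turn boundaries: it treats a turn as ended once silence lasts at least min_endpointing_delay seconds and ignores the turn-detection model entirely. The group_into_turns helper and the sample timings are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (speech start, speech end) in seconds


def group_into_turns(speech: List[Interval], min_endpointing_delay: float) -> List[Interval]:
    """Merge speech intervals separated by less than the delay into one turn."""
    turns: List[Interval] = []
    for start, end in speech:
        if turns and start - turns[-1][1] < min_endpointing_delay:
            # The pause was shorter than the delay, so the turn continues.
            turns[-1] = (turns[-1][0], end)
        else:
            # First interval, or the pause was long enough to end the previous turn.
            turns.append((start, end))
    return turns


# A caller reading out menu input with pauses of 2 s and 3 s.
speech = [(0.0, 1.0), (3.0, 4.0), (7.0, 8.0)]
print(group_into_turns(speech, min_endpointing_delay=0.5))  # three separate turns
print(group_into_turns(speech, min_endpointing_delay=5))    # one merged turn
```

In this model, a short delay cuts the input into three turns, while a 5-second delay keeps the whole sequence together as a single turn.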
session = AgentSession(
    ivr_detection=True,
    min_endpointing_delay=5,
)
llm_node FlushSentinel
We added a FlushSentinel marker that can be yielded from llm_node to flush partial LLM output to TTS and start a new TTS stream. This lets you emit a short, early response (for example, when a specific tool call is detected) while the main LLM response continues in the background. For a concrete pattern, see the flush_llm_node.py example.
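The general idea of a flush marker can be shown without the library. The sketch below is a self-contained model, not LiveKit's implementation: the FlushSentinel class, fake_llm_node, and split_on_flush are stand-ins defined only for this example. It groups streamed text chunks into segments and closes the current segment whenever the marker appears, which is roughly what "start a new TTS stream" implies for a downstream consumer.

```python
import asyncio
from typing import AsyncIterator, List, Union


class FlushSentinel:
    """Stand-in marker for this sketch (not the livekit-agents class)."""


async def fake_llm_node() -> AsyncIterator[Union[str, FlushSentinel]]:
    # Hypothetical node: an early reply, a flush marker, then the main reply.
    yield "One moment, "
    yield "let me check."
    yield FlushSentinel()
    yield "Your order "
    yield "has shipped."


async def split_on_flush(stream: AsyncIterator[Union[str, FlushSentinel]]) -> List[str]:
    """Group text chunks into segments, closing a segment at each marker."""
    segments: List[str] = []
    buffer: List[str] = []
    async for chunk in stream:
        if isinstance(chunk, FlushSentinel):
            # Emit what has accumulated so far and start a fresh segment.
            if buffer:
                segments.append("".join(buffer))
                buffer = []
        else:
            buffer.append(chunk)
    if buffer:
        segments.append("".join(buffer))
    return segments


segments = asyncio.run(split_on_flush(fake_llm_node()))
print(segments)
```

Each resulting segment would correspond to one independent synthesis pass, so the first one can start playing before the rest of the response has been generated.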
async def llm_node(self, chat_ctx: llm.ChatContext, tools: list[llm.FunctionTool], model_settings: ModelSettings) -> AsyncIterable[llm.ChatChunk | FlushSentinel]:
    yield "This is the first sentence"
    yield FlushSentinel()
    yield "Another TTS generation"
What's Changed
- feat: new CLI & new AgentServer API by @theomonnom in #3199
- remove unused code & fix ServerEnvOption by @theomonnom in #3220
- remove custom excepthook by @theomonnom in #3221
- fix python 3.9 by @theomonnom in #3222
- fix invalid LogLevel on the CLI by @theomonnom in #3292
- add Agent.id by @theomonnom in #3478
- add AgentHandoff chat item by @theomonnom in #3479
- Add AgentHandoff to the chat_ctx & AgentSessionReport by @theomonnom in #3541
- fix cli readchar by @theomonnom in #3542
- fix RecorderIO av.error.MemoryError by @theomonnom in #3543
- fix record & save to tempfile by @theomonnom in #3544
- save session json report when --record is enabled by @theomonnom in #3572
- brianyin/agt-1947-automatically-parse-dtmf-input-from-users by @toubatbrian in #3512
- ingest data to cloud by @theomonnom in #3609
- fix Audio/Video input source attach by @theomonnom in #3615
- Allow Recording Verbal DTMF Input when ask_confirmation is turned off by @toubatbrian in #3607
- Agent IVR System Example by @toubatbrian in #3610
- add ChatContext.summarize by @theomonnom in #3660
- Gather DTMF Minor Bug Fix by @toubatbrian in #3672
- brianyin/agt-2076-support-repeat-instruction-in-dtmf-gathering by @toubatbrian in #3674
- rename assistant to agent by @theomonnom in #3690
- TaskGroup by @tinalenguyen in #3680
- ignore on_enter on GetEmailTask by @theomonnom in #3691
- Refactor mock session utilities into a separate file by @toubatbrian in #3692
- fix _MetadataLogProcessor by @tinalenguyen in #3697
- add Created-At header for the audio recording by @theomonnom in #3698
- fix tool validation by @tinalenguyen in #3699
- use otel logger for the chat_history by @theomonnom in #3700
- Support Agent Session Tools by @toubatbrian in #3707
- add extra instructions + tools params into GetEmailTask by @tinalenguyen in #3711
- format transcript logs by @paulwe in #3708
- add participant attributes to traces by @theomonnom in #3725
- fix duplicate agent_session span by @theomonnom in #3726
- fix chat_history upload by @theomonnom in #3728
- rename realtime_session to rtc_session by @theomonnom in #3729
- add backward compatibility by @theomonnom in #3730
- add missing options attr to session start log by @paulwe in #3731
- brian/dtmf-send-tool by @toubatbrian in #3656
- log potential thread leaks preventing process from exiting by @theomonnom in #3744
- check room connection state + rename to on_emit + taskgroup fix by @tinalenguyen in #3738
- add survey agent example by @tinalenguyen in #3681
- update examples to use AgentServer by @tinalenguyen in #3767
- allow multiple ids for out of scope by @tinalenguyen in #3789
- add chat_history json to the report upload by @theomonnom in #3799
- set log timestamps for chat history by @paulwe in #3800
- check recorder_io in make_session_report by @tinalenguyen in #3805
- feat(cartesia): add LiveKit user agent to requests by @mi-yu in #3809
- Add Speechmatics TTS by @aaronng91 in #3754
- built-in GetAddressTask by @tinalenguyen in #3807
- fix extra instructions param and update confirm_address docstring by @tinalenguyen in #3810
- Add support for using a previous silero vad model file by @zaheerabbas-prodigal in #3779
- allow updating the same agent that is running to apply changes in agent by @longcw in #3814
- chore: fix ruff & formatting by @davidzhao in #3827
- fix type checking for agents 1.3 by @longcw in #3842
- fix: correct base64 data handling in image content conversion #3867 by @tarsyang in #3868
- fix observability by @davidzhao in #3828
- avoid rotating transcription synchronizer twice during detach and attach by @longcw in #3845
- fix pickling AgentServer for python 3.9 by @longcw in #3847
- add better word alignment for Cartesia by @chenghao-mou in #3876
- fix jupyter for agents 1.3 by @longcw in #3877
- feat(minimax): comprehensive TTS updates and parameter rename by @zhenyujia23-crypto in #3788
- feat(aws): add credentials customization for aws stt by @civilcoder55 in #3840
- make sure user away timer is cancelled when session closed by @longcw in #3895
- fix duplicate responses from gemini by @tinalenguyen in #3898
- support google safety settings by @tinalenguyen in #3815
- add audio_frame_size_ms for RoomInputOptions by @longcw in #3899
- add new room options by @longcw in #3417
- feat(tts): add sample rate option to TTS configuration for rime tts plugin arcana model by @gokuljs in #3910
- deepgram plugin: better websocket logs by @jjmaldonis in #3912
- Add download location in readme by @chenghao-mou in #3908
- chore: move LK env var checks later by @davidzhao in #3920
- fix ForkServerContext import by @theomonnom in #3924
- add timeout to datastream clear_buffer to avoid deadlock when missing playback finished event by @longcw in #3917
- observability cleanup by @davidzhao in #3929
- feature: GPT-5.1 support by @c0mpli in #3928
- record when the session was started by @davidzhao in #3930
- add <3.14 requirement temporarily by @chenghao-mou in #3921
- Allow tool role for dummy user message by @chenghao-mou in #3938
- add <3.14 requirement temporarily by @chenghao-mou in #3942
- skip CI checks for md changes by @chenghao-mou in #3939
- AGT-2200 Improve usage collector and metric logging with more details by @chenghao-mou in #3935
- feat(cartesia): debug log Cartesia request id on WS connection by @mi-yu in #3940
- Allow users to pick BVCTelephony at runtime by @bcherry in #3926
- don't use decorators for setup_fnc & load_fnc by @theomonnom in #3945
- expose room_io from the AgentSession by @theomonnom in #3946
- Added Support for gpt-5.1-chat-latest by @devb-enp in #3932
- turn-detection: use v0.4.1-intl by @lwestn in #3941
- optimizations for turn detector model size by @davidzhao in #3953
- feat: add AvatarTalk integration by @Maelstro in #3139
- remove noisy error/warn logs by @theomonnom in #3955
- release livekit-agents 1.3.1 by @theomonnom in #3957
- better setup_fnc & load_fnc API & fix examples by @theomonnom in #3958
- fix types for agents 1.3.1 by @longcw in #3959
- livekit-agents 1.3.2 by @theomonnom in #3960
- Skip FallbackLLMStream nested duplicate traces by @chenghao-mou in #3934
- add flush for llm_node by @longcw in #3933
- expose worker_load to prom & /worker endpoint by @theomonnom in #3968
- Fix GPT-5.1 reasoning_effort by @ivanpuhachov in #3966
- include speech duration in VAD EOS by @jayeshp19 in #3951
- revert #2787: execute tools without waiting text generation by @longcw in #3962
- livekit-agents 1.3.3 by @theomonnom in #4024
New Contributors
- @mi-yu made their first contribution in #3809
- @aaronng91 made their first contribution in #3754
- @zaheerabbas-prodigal made their first contribution in #3779
- @tarsyang made their first contribution in #3868
- @chenghao-mou made their first contribution in #3876
- @zhenyujia23-crypto made their first contribution in #3788
- @civilcoder55 made their first contribution in #3840
- @jjmaldonis made their first contribution in #3912
- @c0mpli made their first contribution in #3928
- @devb-enp made their first contribution in #3932
- @Maelstro made their first contribution in #3139
- @ivanpuhachov made their first contribution in #3966
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.2.18...livekit-agents@1.3.3
