New Features
Evals & Testing:
You can now perform turn-by-turn evaluations on your agent interactions. Here's an example of how to validate expected behaviors:
result = await sess.run(user_input="Can I book an appointment? What's your availability for the next two weeks?")
result.expect.skip_next_event_if(type="message", role="assistant")
result.expect.next_event().is_function_call(name="list_available_slots")
result.expect.next_event().is_function_call_output()
await result.expect.next_event().is_message(role="assistant").judge(llm, intent="must confirm no availability")
Check out these practical examples: drive-thru, frontdesk
Documentation: https://docs.livekit.io/agents/build/testing/
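As a rough mental model of the ordered assertions above, the sketch below walks a list of recorded events with a cursor. It is not the library's implementation: `Event`, `EventCursor`, and `check_booking_turn` are hypothetical stand-ins written for illustration, and the LLM-based `judge` step is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    type: str                   # "message", "function_call", or "function_call_output"
    role: Optional[str] = None  # set for messages
    name: Optional[str] = None  # set for function calls


class EventCursor:
    """Hypothetical stand-in for an ordered-event assertion helper."""

    def __init__(self, events: List[Event]) -> None:
        self._events = events
        self._pos = 0

    def skip_next_event_if(self, **expected) -> None:
        # Advance only when the next event matches every given field.
        if self._pos < len(self._events) and self._matches(self._events[self._pos], expected):
            self._pos += 1

    def next_event(self) -> Event:
        # Consume and return the next event, failing if the run has ended.
        if self._pos >= len(self._events):
            raise AssertionError("no more events")
        event = self._events[self._pos]
        self._pos += 1
        return event

    @staticmethod
    def _matches(event: Event, expected: dict) -> bool:
        return all(getattr(event, key) == value for key, value in expected.items())


def check_booking_turn(events: List[Event]) -> bool:
    cursor = EventCursor(events)
    # An optional assistant message may precede the tool call.
    cursor.skip_next_event_if(type="message", role="assistant")
    call = cursor.next_event()
    assert call.type == "function_call" and call.name == "list_available_slots"
    assert cursor.next_event().type == "function_call_output"
    reply = cursor.next_event()
    assert reply.type == "message" and reply.role == "assistant"
    return True
```

Because the leading assistant message is skipped only when present, the same check passes whether or not the agent speaks before calling the tool.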
Preemptive Generation
This feature speculatively starts LLM and TTS processing before the user's turn ends, reducing response latency by overlapping generation with the user's remaining audio. It is disabled by default; enable it with:
session = AgentSession(..., preemptive_generation=True)
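The idea behind speculative generation can be sketched with plain asyncio. This is a toy model under stated assumptions, not how AgentSession implements it: `fake_llm` and `PreemptiveResponder` are hypothetical names, and the rule "reuse the speculative reply only if the final transcript equals the preliminary one" is an assumption made for the example.

```python
import asyncio
from typing import Optional


async def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a slow LLM + TTS request.
    await asyncio.sleep(0.05)
    return f"reply to: {prompt}"


class PreemptiveResponder:
    """Toy model: generate on a preliminary transcript, reuse it if the final one matches."""

    def __init__(self) -> None:
        self._task: Optional[asyncio.Task] = None
        self._prompt: Optional[str] = None
        self.reused = False

    def on_preliminary_transcript(self, text: str) -> None:
        # Restart speculation whenever the preliminary transcript changes.
        if self._task is not None and text == self._prompt:
            return
        if self._task is not None:
            self._task.cancel()
        self._prompt = text
        self._task = asyncio.create_task(fake_llm(text))

    async def on_end_of_turn(self, final_text: str) -> str:
        task, prompt = self._task, self._prompt
        self._task = self._prompt = None
        if task is not None and prompt == final_text:
            self.reused = True
            return await task  # speculation paid off, the reply is already in flight
        if task is not None:
            task.cancel()  # speculation was wrong, discard it
        self.reused = False
        return await fake_llm(final_text)


async def demo() -> tuple:
    responder = PreemptiveResponder()
    responder.on_preliminary_transcript("book an appointment")
    hit = await responder.on_end_of_turn("book an appointment")
    hit_reused = responder.reused
    responder.on_preliminary_transcript("book an")
    miss = await responder.on_end_of_turn("book an appointment tomorrow")
    return hit, hit_reused, miss, responder.reused
```

Running `asyncio.run(demo())` exercises both paths: the first turn reuses the speculative reply, while the second discards it and generates again from the final transcript.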
Enhanced End-of-Turn (EOU) Detection
The end-of-turn model has been refined to reduce sensitivity to punctuation and better handle multilingual scenarios, notably improving Hindi language support.
Documentation: https://docs.livekit.io/agents/build/turns/turn-detector/#supported-languages
OpenTelemetry Integration
Agents now support tracing of LLM/TTS requests and user callbacks using OpenTelemetry. See the LangFuse example for a detailed implementation.
Experimental Agent Tasks
AgentTask is a new experimental feature that lets an agent terminate once it achieves a specific goal. You can await an AgentTask directly in your workflows:
@function_tool
async def schedule_appointment(self, ctx: RunContext[Userdata], slot_id: str) -> str:
    # Attempts to retrieve the user's email, allowing multiple agent-user interactions
    email_result = await beta.workflows.GetEmailTask(chat_ctx=self.chat_ctx)
Half-Duplex Pipeline
Combine Gemini or OpenAI's realtime STT/LLM with a separate TTS engine, optimizing your agent's voice interactions:
session = AgentSession(
    llm=openai.realtime.RealtimeModel(modalities=["text"]),
    # Alternatively: llm=google.beta.realtime.RealtimeModel(modalities=[Modality.TEXT]),
    tts=openai.TTS(voice="ash"),
)
View the complete example.
Documentation: https://docs.livekit.io/agents/integrations/realtime/#separate-tts
Improved Transcription Synchronization
Align transcripts with the speech output of TTS engines such as Cartesia and ElevenLabs:
session = AgentSession(..., use_tts_aligned_transcript=True)
Refer to the complete example.
Documentation: https://docs.livekit.io/agents/build/text/#tts-aligned-transcriptions
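To illustrate what aligning a transcript with speech can mean, here is a minimal sketch that assumes the TTS engine reports a start and end time for each word. The `TimedWord` shape and `transcript_at` helper are hypothetical and not part of the library.

```python
from typing import List, Tuple

# Hypothetical shape: (text, start_sec, end_sec) for each synthesized word.
TimedWord = Tuple[str, float, float]


def transcript_at(words: List[TimedWord], playback_sec: float) -> str:
    """Return the words whose start time has been reached at playback_sec."""
    return " ".join(text for text, start, _end in words if start <= playback_sec)


words = [("Hello", 0.0, 0.4), ("there", 0.4, 0.8), ("friend", 0.9, 1.3)]
print(transcript_at(words, 0.5))  # prints "Hello there"
```

With timings like these, a transcript can be revealed in step with playback, or cut at the last word that had started if the speech is interrupted.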
Upgraded Tokenization Engine
The default tokenizer is now Blingfire, replacing the previous naive implementation and improving tokenization accuracy across multiple languages.
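To see why a rule-based splitter struggles, consider the sketch below. It is a hypothetical naive sentence splitter written for illustration, not the implementation that was replaced and not Blingfire itself.

```python
import re
from typing import List


def naive_sentences(text: str) -> List[str]:
    # Split after ., ! or ? followed by whitespace. No abbreviation or script handling.
    return [part for part in re.split(r"(?<=[.!?])\s+", text.strip()) if part]


print(naive_sentences("Dr. Smith is in. Call back later!"))
# ['Dr.', 'Smith is in.', 'Call back later!']

print(naive_sentences("नमस्ते। आप कैसे हैं।"))
# ['नमस्ते। आप कैसे हैं।']
```

The first call wrongly breaks after the abbreviation "Dr.", and the second never splits because the Hindi danda (।) is not among the hard-coded terminators. A dedicated tokenization engine is meant to handle cases like these.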
Complete changelog
- introduce AgentTask by @theomonnom in #2483
- introduce workflows & GetEmailAgent by @theomonnom in #2498
- drive-thru example by @theomonnom in #2609
- reuse SpeechHandle for all generations inside a single turn by @theomonnom in #2623
- introduce test & eval primitives by @theomonnom in #2662
- evals: add maybe_* utils by @theomonnom in #2681
- evals: better error message for assertions by @theomonnom in #2682
- evals: RunResult final_output on Agent tasks by @theomonnom in #2696
- evals: AgentTask GetEmailAdress tests e.g by @theomonnom in #2697
- allow optional RunResult output_type by @theomonnom in #2698
- evals: add EventRangeAssert utils by @theomonnom in #2699
- add front-desk agent example by @theomonnom in #2724
- fix InlineAgent agent resume on error by @theomonnom in #2730
- add ChatContext.merge & merge inline tasks chat_ctx by @theomonnom in #2731
- better GetEmailAgent instructions by @theomonnom in #2732
- exclude function_call inside ChatContext.merge by @theomonnom in #2733
- add Blingfire tokenizer & use it by default by @theomonnom in #2771
- fix RealtimeModel generate_reply authorization by @theomonnom in #2773
- support timed transcripts from tts by @longcw in #2580
- ignore empty sentence in tts stream adapter by @longcw in #2777
- fix types for agents 1.2 by @longcw in #2778
- fix MockTools type by @longcw in #2781
- fix RunResult order of fnc_call & agent_handoff by @theomonnom in #2782
- fix types by @theomonnom in #2783
- fix tr_input by @theomonnom in #2784
- fix GetEmailAgent instructions by @theomonnom in #2786
- fix blingfire tokenizer test by @longcw in #2785
- support tts with realtime model (audio in, text out) by @longcw in #2628
- fix assistant message order on the RunResult by @theomonnom in #2787
- fix FrontDeskAgent list_available_slots by @theomonnom in #2788
- initial evals for the FrontDesk agent by @theomonnom in #2790
- ignore empty assistant messages by @theomonnom in #2792
- evals: add CI by @theomonnom in #2791
- evals ci: use python 3.12 by @theomonnom in #2793
- fix confirmation/validation ambiguity on GetEmailAgent instructions by @theomonnom in #2794
- punctuation free turn detector by @jeradf in #2717
- frontdesk: ToolError example by @theomonnom in #2808
- evals API improvements by @theomonnom in #2846
- make arguments optional for mock_tools by @theomonnom in #2847
- allow returning Exception inside function tools by @theomonnom in #2848
- add envvar to enable verbose evals logs by @theomonnom in #2849
- preemptive generation before end of user turn by @longcw in #2728
- fix next_event return type by @theomonnom in #2856
- evals: add docstrings to the public API by @theomonnom in #2857
- only print the judge result when verbose is enabled by @theomonnom in #2858
- Add contains_agent_handoff assertion by @bcherry in #2862
- allow editing SpeechHandle allow_interruptions & add RunContext.disallow_interruptions by @theomonnom in #2864
- fix evals test by @theomonnom in #2865
- fix ruff and types by @longcw in #2889
- add opentelemetry trace by @longcw in #2873
- fix unordered user messages by @theomonnom in #2891
- fix livekit-agents 1.2 tests by @theomonnom in #2866
- cleanup & prepare for release by @theomonnom in #2893
- add prometheus by @theomonnom in #2908
- add gen_ai attributes to llm_request by @longcw in #2905
- fix types and aws realtime model by @longcw in #2910
- fix TTS fallback adapter metrics_collected event by @longcw in #2890
- add model property for llm plugins by @longcw in #2914
- nit: mprove drivethru by @theomonnom in #2918
- Removing ctx.connect() from examples by @sascotto in #2909
- expose tokenizer option for cartesia tts by @longcw in #2916
- remove openai prewarm by @theomonnom in #2919
- add tts_audio_duration to usage metrics collection by @Panmax in #2915
- [DRAFT] Add inference process health check endpoint by @alfredguiaugment in #2906
- only check inference process health if started by @theomonnom in #2920
- fix missing field in UsageCollector by @davidzhao in #2929
- fix ruff & bump livekit-agents 1.2.0 by @theomonnom in #2936
New Contributors
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.1.7...livekit-agents@1.2.0