livekit/agents livekit-agents@1.2.0 on GitHub

New Features

Evals & Testing:

You can now perform turn-by-turn evaluations on your agent interactions. Here's an example of how to validate expected behaviors:

result = await sess.run(user_input="Can I book an appointment? What's your availability for the next two weeks?")
result.expect.skip_next_event_if(type="message", role="assistant")
result.expect.next_event().is_function_call(name="list_available_slots")
result.expect.next_event().is_function_call_output()
await result.expect.next_event().is_message(role="assistant").judge(llm, intent="must confirm no availability")

Check out these practical examples: drive-thru, frontdesk

Documentation: https://docs.livekit.io/agents/build/testing/

Preemptive Generation

This feature enables speculative initiation of LLM and TTS processing before the user's turn concludes, significantly reducing response latency by overlapping processing with user audio. Disabled by default:

session = AgentSession(..., preemptive_generation=True)

Enhanced End-of-Turn (EOU) Detection

The end-of-turn model has been refined to reduce sensitivity to punctuation and better handle multilingual scenarios, notably improving Hindi language support.
'
Documentation: https://docs.livekit.io/agents/build/turns/turn-detector/#supported-languages

OpenTelemetry Integration

Agent now supports tracing for LLM/TTS requests and user callbacks using OpenTelemetry. See LangFuse example for detailed implementation.

Experimental Agent Tasks

AgentTask is a new experimental subset feature allowing agents to terminate upon achieving specific goals. You can await AgentTasks directly in your workflows:

@function_tool
async def schedule_appointment(self, ctx: RunContext[Userdata], slot_id: str) -> str:
    # Attempts to retrieve user email, allowing multiple agent-user interactions
    email_result = await beta.workflows.GetEmailTask(chat_ctx=self.chat_ctx)

Half-Duplex Pipeline

Combine Gemini or OpenAI's realtime STT/LLM with a separate TTS engine, optimizing your agent's voice interactions:

session = AgentSession(
    llm=openai.realtime.RealtimeModel(modalities=["text"]),
    # Alternatively: llm=google.beta.realtime.RealtimeModel(modalities=[Modality.TEXT]),
    tts=openai.TTS(voice="ash"),
)

View the complete example.

Documentation: https://docs.livekit.io/agents/integrations/realtime/#separate-tts

Improved Transcription Synchronization

Align transcripts accurately with speech outputs from TTS engines such as Cartesia and 11labs for improved synchronization:

session = AgentSession(..., use_tts_aligned_transcript=True)

Refer to the complete example.

Documentation: https://docs.livekit.io/agents/build/text/#tts-aligned-transcriptions

Upgraded Tokenization Engine

Transitioned to the Blingfire tokenization engine from the previous naive implementation, significantly enhancing handling and accuracy for multiple languages.

Complete changelog

introduce AgentTask by @theomonnom in #2483
introduce workflows & GetEmailAgent by @theomonnom in #2498
drive-thru example by @theomonnom in #2609
reuse SpeechHandle for all generations inside a single turn by @theomonnom in #2623
introduce test & eval primitives by @theomonnom in #2662
evals: add maybe_* utils by @theomonnom in #2681
evals: better error message for assertions by @theomonnom in #2682
evals: RunResult final_output on Agent tasks by @theomonnom in #2696
evals: AgentTask GetEmailAdress tests e.g by @theomonnom in #2697
allow optional RunResult output_type by @theomonnom in #2698
evals: add EventRangeAssert utils by @theomonnom in #2699
add front-desk agent example by @theomonnom in #2724
fix InlineAgent agent resume on error by @theomonnom in #2730
add ChatContext.merge & merge inline tasks chat_ctx by @theomonnom in #2731
better GetEmailAgent instructions by @theomonnom in #2732
exclude function_call inside ChatContext.merge by @theomonnom in #2733
add Blingfire tokenizer & use it by default by @theomonnom in #2771
fix RealtimeModel generate_reply authorization by @theomonnom in #2773
support timed transcripts from tts by @longcw in #2580
ignore empty sentence in tts stream adapter by @longcw in #2777
fix types for agents 1.2 by @longcw in #2778
fix MockTools type by @longcw in #2781
fix RunResult order of fnc_call & agent_handoff by @theomonnom in #2782
fix types by @theomonnom in #2783
fix tr_input by @theomonnom in #2784
fix GetEmailAgent instructions by @theomonnom in #2786
fix blingfire tokenizer test by @longcw in #2785
support tts with realtime model (audio in, text out) by @longcw in #2628
fix assistant message order on the RunResult by @theomonnom in #2787
fix FrontDeskAgent list_available_slots by @theomonnom in #2788
initial evals for the FrontDesk agent by @theomonnom in #2790
ignore empty assistant messages by @theomonnom in #2792
evals: add CI by @theomonnom in #2791
evals ci: use python 3.12 by @theomonnom in #2793
fix confirmation/validation ambiguity on GetEmailAgent instructions by @theomonnom in #2794
punctuation free turn detector by @jeradf in #2717
frontdesk: ToolError example by @theomonnom in #2808
evals API improvements by @theomonnom in #2846
make arguments optional for mock_tools by @theomonnom in #2847
allow returning Exception inside function tools by @theomonnom in #2848
add envvar to enable verbose evals logs by @theomonnom in #2849
preemptive generation before end of user turn by @longcw in #2728
fix next_event return type by @theomonnom in #2856
evals: add docstrings to the public API by @theomonnom in #2857
only print the judge result when verbose is enabled by @theomonnom in #2858
Add contains_agent_handoff assertion by @bcherry in #2862
allow editing SpeechHandle allow_interruptions & add RunContext.disallow_interruptions by @theomonnom in #2864
fix evals test by @theomonnom in #2865
fix ruff and types by @longcw in #2889
add opentelemetry trace by @longcw in #2873
fix unordered user messages by @theomonnom in #2891
fix livekit-agents 1.2 tests by @theomonnom in #2866
cleanup & prepare for release by @theomonnom in #2893
add prometheus by @theomonnom in #2908
add gen_ai attributes to llm_request by @longcw in #2905
fix types and aws realtime model by @longcw in #2910
fix TTS fallback adapter metrics_collected event by @longcw in #2890
add model property for llm plugins by @longcw in #2914
nit: mprove drivethru by @theomonnom in #2918
Removing ctx.connect() from examples by @sascotto in #2909
expose tokenizer option for cartesia tts by @longcw in #2916
remove openai prewarm by @theomonnom in #2919
add tts_audio_duration to usage metrics collection by @Panmax in #2915
[DRAFT] Add inference process health check endpoint by @alfredguiaugment in #2906
only check inference process health if started by @theomonnom in #2920
fix missing field in UsageCollector by @davidzhao in #2929
fix ruff & bump livekit-agents 1.2.0 by @theomonnom in #2936

New Contributors

@sascotto made their first contribution in #2909

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.1.7...livekit-agents@1.2.0