1.2.0 (2026-05-22)
Features Added
- Web Search & File Search: Added support for built-in web search and file search tools:
  - New item types: `ResponseWebSearchCallItem`, `ResponseFileSearchCallItem`
  - New server events for web/file search lifecycle (`searching`, `in_progress`, `completed`)
  - New models: `ActionFind`, `ActionOpenPage`, `ActionSearch`, `ActionSearchSource`, `FileSearchResult`
  - New enum values: `ItemType.WEB_SEARCH_CALL`, `ItemType.FILE_SEARCH_CALL`
  - New `SessionIncludeOption` enum for controlling what data is included in session responses
- MCP (Model Context Protocol) Support: Added comprehensive support for Model Context Protocol integration:
  - `MCPServer` tool type for defining MCP server configurations with authorization, headers, and approval requirements
  - `MCPTool` model for representing MCP tool definitions with input schemas and annotations
  - `MCPApprovalType` enum for controlling approval workflows (`never`, `always`, or tool-specific)
  - New item types for MCP approval and call workflows
  - New server events for MCP tool listing, call lifecycle, and approval flows
- Avatar Enhancements:
  - Added `AzureAvatarVoiceSyncVoice` for avatar voice sync configuration
  - Added `ServerEventSessionAvatarSwitchToIdle` and `ServerEventSessionAvatarSwitchToSpeaking` events
  - Added `ServerEventResponseVideoDelta` for avatar video frame streaming
  - Added `ClientEventOutputAudioBufferClear` and `ServerEventOutputAudioBufferCleared` for output buffer management
  - Added `AvatarConfigTypes` enum with support for `video-avatar` and `photo-avatar` types
  - Added `AvatarOutputProtocol` enum for avatar streaming protocols (`webrtc`, `websocket`)
  - Added `Scene` model for controlling avatar zoom, position, rotation, and movement amplitude
  - Added `output_audit_audio` field to `AvatarConfig`
- OpenTelemetry Tracing Support: Added `VoiceLiveInstrumentor` for opt-in OpenTelemetry-based tracing of VoiceLive WebSocket connections, following Azure SDK and GenAI semantic conventions.
  - Enable via the `AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING=true` environment variable
  - Content recording controlled by `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT`
  - Comprehensive session-level telemetry: session ID, audio format, first-token latency, turn count, interruption count, audio bytes sent/received, message size
  - Response & function call ID tracking for end-to-end tracing
  - Agent v2 telemetry with agent identity and configuration tracking
  - MCP telemetry with tool call and approval flow tracking
- Agent Session Configuration: Added flattened `connect()` keyword arguments for configuring Azure AI Foundry agents at connection time with `agent_name`, `project_name`, `agent_version`, `conversation_id`, and more
- Transcription Improvements:
  - Added `TranscriptionPhrase` and `TranscriptionWord` models for detailed transcription data
  - Added `ServerEventResponseAudioTranscriptAnnotationAdded` event
  - Added `gpt-4o-transcribe-diarize` and `mai-transcribe-1` transcription model support
- Interim Response Configuration: Added `StaticInterimResponseConfig` and `LlmInterimResponseConfig` for generating interim responses during latency or tool calls
- Image Content Support: Added `RequestImageContentPart` for image inputs in conversations
- Reasoning Effort Control: Added `reasoning_effort` field with `ReasoningEffort` enum
- Response Metadata: Added `metadata` field to `Response` and `ResponseCreateParams`
- Server Warning Events: Added `ServerEventWarning` for handling non-fatal warnings
- Personal Voice Models: Added `DragonHDOmniLatestNeural` and `MAI-Voice-1` model options
- Enhanced OpenAI Voices: Added `marin` and `cedar` voices to `OpenAIVoiceName` enum
- Enhanced Azure Personal Voice: Added `custom_lexicon_url`, `prefer_locales`, `locale`, `style`, `pitch`, `rate`, and `volume` properties
- Pre-generated Assistant Messages: Added `pre_generated_assistant_message` in `ResponseCreateParams`
- Explicit Null Values: Enhanced `RequestSession` to properly serialize explicitly set `None` values
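
The tracing opt-in described under OpenTelemetry Tracing Support is driven by two environment variables. The following is a minimal sketch of setting them from Python before a connection is opened. The helper name `enable_voicelive_tracing` is hypothetical and not part of the package, and the `"true"`/`"false"` values for the content flag are an assumption; only the two variable names come from the entry above. The sketch does not call `VoiceLiveInstrumentor` itself, since its methods are not described here.

```python
import os

# Variable names as listed in the OpenTelemetry Tracing Support entry.
TRACING_FLAG = "AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING"
CONTENT_FLAG = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def enable_voicelive_tracing(capture_content=False, env=None):
    """Hypothetical helper: set the opt-in tracing variables and return them.

    Pass a plain dict as `env` to preview the values without touching
    the real process environment.
    """
    target = os.environ if env is None else env
    target[TRACING_FLAG] = "true"
    # Assumption: the content-recording flag accepts "true"/"false".
    target[CONTENT_FLAG] = "true" if capture_content else "false"
    return {name: target[name] for name in (TRACING_FLAG, CONTENT_FLAG)}


# Preview against an empty dict; omit `env` to apply to os.environ instead.
settings = enable_voicelive_tracing(capture_content=False, env={})
print(settings)
```

Keeping content capture off by default mirrors the opt-in wording of the entry; turn it on only where recording message content is acceptable.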
Breaking Changes
- Removed Foundry Agent Tool classes (`FoundryAgentTool`, `ResponseFoundryAgentCallItem`, etc.); use flattened Azure AI Foundry keyword arguments with `connect()` instead
- Audio Format Values: Changed `OutputAudioFormat` enum values to use underscore format (`pcm16_8000hz`, `pcm16_16000hz`) instead of the previous hyphenated values. This is a breaking change for code that compares, persists, or serializes the raw enum values. Legacy hyphenated values continue to deserialize for backward compatibility.
- Renamed `AvatarConfig.type` field to `avatar_type` to avoid conflict with Python's built-in `type`
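
For code affected by the Audio Format Values change, one option is to normalize stored raw strings before comparing them. The sketch below assumes the legacy spelling differs only by using a hyphen where the new value has an underscore (for example `pcm16-8000hz`); that spelling is inferred from the entry above, not confirmed by it. The helper `normalize_audio_format` is hypothetical and covers only the two values named in this changelog.

```python
# The two underscore-style values named in the Audio Format Values entry.
RENAMED_FORMATS = {"pcm16_8000hz", "pcm16_16000hz"}


def normalize_audio_format(raw):
    """Hypothetical helper: map a persisted legacy value to the underscore form.

    Values that do not correspond to one of the renamed formats are
    returned unchanged.
    """
    # Assumption: the legacy value is the new value with "-" in place of "_".
    candidate = raw.replace("-", "_")
    return candidate if candidate in RENAMED_FORMATS else raw


for stored in ("pcm16-8000hz", "pcm16_16000hz", "pcm16"):
    print(stored, "->", normalize_audio_format(stored))
```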
Other Changes
- Updated default API version to `2026-04-10`