1.1.0-beta.2 (2026-02-19)
Features Added
- MCP (Model Context Protocol) Support: Added comprehensive support for MCP server integration
  - Added VoiceLiveMcpServerDefinition for configuring external MCP servers as tools
  - Added MCPApprovalType enum for controlling tool execution approval workflows ("always", "never")
  - Added VoiceLiveMcpTool class for representing MCP tool definitions with JSON schemas
  - Added MCP-specific session update events:
    - SessionUpdateMcpListToolsInProgress - Tool discovery started
    - SessionUpdateMcpListToolsCompleted - Tool discovery completed
    - SessionUpdateMcpListToolsFailed - Tool discovery failed
    - SessionUpdateResponseMcpCallArgumentsDelta - Tool call arguments streaming
    - SessionUpdateResponseMcpCallArgumentsDone - Tool call arguments complete
    - SessionUpdateResponseMcpCallInProgress - Tool execution started
    - SessionUpdateResponseMcpCallCompleted - Tool execution completed
    - SessionUpdateResponseMcpCallFailed - Tool execution failed
  - Added MCP-specific response items:
    - SessionResponseMcpListToolItem - Contains list of available tools from MCP server
    - SessionResponseMcpCallItem - Represents an MCP tool call with optional approval request
    - SessionResponseMcpApprovalRequestItem - Human-in-the-loop approval request
    - SessionResponseMcpApprovalResponseItem - Approval decision response
  - MCP configuration properties on VoiceLiveMcpServerDefinition:
    - ServerLabel - Unique identifier for the MCP server
    - ServerUrl - HTTP endpoint for the MCP server
    - Authorization - Optional authorization header value
    - Headers - Custom HTTP headers for MCP server requests
    - AllowedTools - Optional list of tool names to enable (whitelist)
    - RequireApproval - Approval policy for tool execution
- Enhanced Avatar Configuration: Extended avatar capabilities with new configuration options
  - Added AvatarConfigTypes enum for avatar type selection ("photo_avatar", "video_avatar")
  - Added PhotoAvatarBaseModes enum for photo avatar model selection ("vasa1")
  - Added AvatarOutputProtocol enum for output protocol selection ("webrtc", "websocket")
  - New properties on AvatarConfiguration:
    - Type - Avatar type (photo or video)
    - Model - Base model for photo avatars
    - OutputProtocol - Protocol for avatar data streaming
- Personal Voice Enhancements: Added prosody and localization controls for Azure Personal Voice
  - New properties on AzurePersonalVoice:
    - Locale - Primary locale for speech synthesis
    - PreferLocales - List of preferred fallback locales
    - Rate - Speech rate adjustment (e.g., "+10%", "-15%")
    - Pitch - Pitch adjustment (e.g., "+2st", "-1st")
    - Volume - Volume adjustment (e.g., "+6dB", "-3dB")
    - Style - Speaking style selection
    - CustomLexiconUrl - URL to custom pronunciation lexicon
- Image Input Support: Added multimodal image input capabilities
  - Added RequestImageContentPart for including images in user messages
  - Added RequestImageContentPartDetail enum for image quality control ("auto", "low", "high")
  - Properties on RequestImageContentPart:
    - Url - Image URL (supports http://, https://, and data: URIs)
    - Detail - Image processing detail level
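Because the Url property of RequestImageContentPart accepts data: URIs, a local image can be inlined instead of hosted. The following is a minimal sketch of building such a URI with the .NET base class library only. The helper name ImageDataUri.FromBytes and the default "image/png" media type are illustrative assumptions, not part of the SDK, and the changelog does not state which media types or payload sizes the service accepts.

```csharp
using System;

public static class ImageDataUri
{
    // Hypothetical helper (not an SDK API): builds a base64 data: URI
    // from raw image bytes and a caller-supplied media type.
    public static string FromBytes(byte[] imageBytes, string mediaType = "image/png")
    {
        if (imageBytes is null) throw new ArgumentNullException(nameof(imageBytes));
        return $"data:{mediaType};base64,{Convert.ToBase64String(imageBytes)}";
    }

    public static void Main()
    {
        // The 8-byte PNG file signature, used here as stand-in image data.
        byte[] pngSignature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

        // Prints: data:image/png;base64,iVBORw0KGgo=
        Console.WriteLine(FromBytes(pngSignature));
    }
}
```

The resulting string would be the value assigned to Url. How a RequestImageContentPart instance is constructed is not shown here because this changelog does not document its constructor.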
- Enhanced Token Tracking: Added image token usage metrics
  - Added ImageTokens property to InputTokenDetails for tracking image processing costs
  - Added ImageTokens property to CachedTokenDetails for cached image token tracking
- New Voice Options: Added new OpenAI voice presets
  - OAIVoice.Cedar - Additional voice option
  - OAIVoice.Marin - Additional voice option
- Foundry Agent Support: Added support for agent-centric sessions using Azure AI Foundry agents
  - Added AgentSessionConfig class for configuring Foundry agent sessions:
    - AgentName - The name of the Foundry agent to use
    - ProjectName - The name of the Azure AI project which the agent belongs to
    - AgentVersion - Optional version of the agent to use
    - ConversationId - Optional conversation ID to continue
    - AuthenticationIdentityClientId - Optional client ID for user-assigned managed identity
    - FoundryResourceOverride - Optional Foundry resource name for cross-resource agent mode
  - Added SessionTarget class for specifying session targets (model or agent):
    - SessionTarget.FromModel(string model) - Creates a model-centric session target
    - SessionTarget.FromAgent(AgentSessionConfig agentConfig) - Creates an agent-centric session target
    - Implicit conversions from string and AgentSessionConfig
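The practical effect of the implicit conversions on SessionTarget is that a method taking a target can be called with a plain model name or with an agent configuration, without an explicit factory call. The sketch below illustrates that pattern with self-contained stand-in types. DemoSessionTarget and DemoAgentConfig are hypothetical and exist only for this illustration; they are not the SDK types and make no claim about how the library implements its conversions.

```csharp
using System;

// Hypothetical stand-in for an agent configuration (NOT the SDK's AgentSessionConfig).
public sealed class DemoAgentConfig
{
    public DemoAgentConfig(string agentName, string projectName)
    {
        AgentName = agentName;
        ProjectName = projectName;
    }

    public string AgentName { get; }
    public string ProjectName { get; }
}

// Hypothetical stand-in for a model-or-agent target (NOT the SDK's SessionTarget).
public sealed class DemoSessionTarget
{
    private DemoSessionTarget(string model, DemoAgentConfig agent)
    {
        Model = model;
        Agent = agent;
    }

    public string Model { get; }
    public DemoAgentConfig Agent { get; }
    public bool IsAgent => Agent != null;

    public static DemoSessionTarget FromModel(string model) =>
        new DemoSessionTarget(model ?? throw new ArgumentNullException(nameof(model)), null);

    public static DemoSessionTarget FromAgent(DemoAgentConfig agent) =>
        new DemoSessionTarget(null, agent ?? throw new ArgumentNullException(nameof(agent)));

    // Implicit conversions let callers pass either source type directly.
    public static implicit operator DemoSessionTarget(string model) => FromModel(model);
    public static implicit operator DemoSessionTarget(DemoAgentConfig agent) => FromAgent(agent);
}

public static class Program
{
    // A single method signature serves both kinds of caller.
    public static string Describe(DemoSessionTarget target) =>
        target.IsAgent ? $"agent:{target.Agent.AgentName}" : $"model:{target.Model}";

    public static void Main()
    {
        Console.WriteLine(Describe("example-model"));
        Console.WriteLine(Describe(new DemoAgentConfig("example-agent", "example-project")));
    }
}
```

Running the sketch prints model:example-model followed by agent:example-agent. The model, agent, and project names are placeholders.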
- Interim Response Configuration: Added support for interim responses during latency or tool execution delays
  - Added InterimResponseConfigBase abstract base class with Triggers and LatencyThresholdMs properties
  - Added InterimResponseTrigger extensible enum (Latency, Tool)
  - Added LlmInterimResponseConfig for AI-generated interim responses with Model, Instructions, and MaxCompletionTokens properties
  - Added StaticInterimResponseConfig for predefined static text interim responses with Texts collection
- New VoiceLiveClient Session Methods: Added new overloads for creating and starting sessions
  - CreateSession(string model) - Creates an unconnected session for a model
  - CreateSession(SessionTarget target) - Creates an unconnected session from a model or agent target
  - CreateSession(VoiceLiveSessionOptions sessionConfig) - Creates an unconnected session from session configuration
  - StartSessionAsync(AgentSessionConfig agentConfig, ...) - Starts a connected session with a Foundry agent
  - StartSessionAsync(SessionTarget target, ...) - Starts a connected session from a model or agent target
  - StartSessionAsync(SessionTarget target, VoiceLiveSessionOptions sessionConfig, ...) - Starts a connected session from a target with additional configuration
- New Service Version: Added ServiceVersion.V2026_01_01_PREVIEW to VoiceLiveClientOptions