Azure/azure-sdk-for-net Azure.AI.VoiceLive

1.1.0-beta.2 (2026-02-19)

Features Added

MCP (Model Context Protocol) Support: Added comprehensive support for MCP server integration
- Added VoiceLiveMcpServerDefinition for configuring external MCP servers as tools
- Added MCPApprovalType enum for controlling tool execution approval workflows ("always", "never")
- Added VoiceLiveMcpTool class for representing MCP tool definitions with JSON schemas
- Added MCP-specific session update events:
  - SessionUpdateMcpListToolsInProgress - Tool discovery started
  - SessionUpdateMcpListToolsCompleted - Tool discovery completed
  - SessionUpdateMcpListToolsFailed - Tool discovery failed
  - SessionUpdateResponseMcpCallArgumentsDelta - Tool call arguments streaming
  - SessionUpdateResponseMcpCallArgumentsDone - Tool call arguments complete
  - SessionUpdateResponseMcpCallInProgress - Tool execution started
  - SessionUpdateResponseMcpCallCompleted - Tool execution completed
  - SessionUpdateResponseMcpCallFailed - Tool execution failed
- Added MCP-specific response items:
  - SessionResponseMcpListToolItem - Contains list of available tools from MCP server
  - SessionResponseMcpCallItem - Represents an MCP tool call with optional approval request
  - SessionResponseMcpApprovalRequestItem - Human-in-the-loop approval request
  - SessionResponseMcpApprovalResponseItem - Approval decision response
- MCP configuration properties on VoiceLiveMcpServerDefinition:
  - ServerLabel - Unique identifier for the MCP server
  - ServerUrl - HTTP endpoint for the MCP server
  - Authorization - Optional authorization header value
  - Headers - Custom HTTP headers for MCP server requests
  - AllowedTools - Optional list of tool names to enable (whitelist)
  - RequireApproval - Approval policy for tool execution

Enhanced Avatar Configuration: Extended avatar capabilities with new configuration options
- Added AvatarConfigTypes enum for avatar type selection ("photo_avatar", "video_avatar")
- Added PhotoAvatarBaseModes enum for photo avatar model selection ("vasa1")
- Added AvatarOutputProtocol enum for output protocol selection ("webrtc", "websocket")
- New properties on AvatarConfiguration:
  - Type - Avatar type (photo or video)
  - Model - Base model for photo avatars
  - OutputProtocol - Protocol for avatar data streaming

Personal Voice Enhancements: Added prosody and localization controls for Azure Personal Voice
- New properties on AzurePersonalVoice:
  - Locale - Primary locale for speech synthesis
  - PreferLocales - List of preferred fallback locales
  - Rate - Speech rate adjustment (e.g., "+10%", "-15%")
  - Pitch - Pitch adjustment (e.g., "+2st", "-1st")
  - Volume - Volume adjustment (e.g., "+6dB", "-3dB")
  - Style - Speaking style selection
  - CustomLexiconUrl - URL to custom pronunciation lexicon
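
The Rate, Pitch, and Volume examples above all follow a signed-number-plus-unit shape. A minimal sketch of building such strings with only the .NET base class library follows. The `Prosody` helper and its method names are hypothetical and not part of Azure.AI.VoiceLive, and the service may accept other value forms than the ones shown in these notes.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of Azure.AI.VoiceLive): formats signed
// prosody adjustments in the shapes shown in the release notes.
public static class Prosody
{
    // The two-section format "+0;-0" prints an explicit sign for both
    // positive and negative values (zero uses the first section).
    private static string Signed(int value) =>
        value.ToString("+0;-0", CultureInfo.InvariantCulture);

    public static string Percent(int value) => Signed(value) + "%";     // rate, e.g. "+10%"
    public static string Semitones(int value) => Signed(value) + "st";  // pitch, e.g. "+2st"
    public static string Decibels(int value) => Signed(value) + "dB";   // volume, e.g. "+6dB"
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(Prosody.Percent(10));    // +10%
        Console.WriteLine(Prosody.Semitones(-1));  // -1st
        Console.WriteLine(Prosody.Decibels(6));    // +6dB
    }
}
```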

Image Input Support: Added multimodal image input capabilities
- Added RequestImageContentPart for including images in user messages
- Added RequestImageContentPartDetail enum for image quality control ("auto", "low", "high")
- Properties on RequestImageContentPart:
  - Url - Image URL (supports http://, https://, and data: URIs)
  - Detail - Image processing detail level
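
Because Url accepts data: URIs, a local image can be sent inline instead of being hosted. The sketch below shows one way to build such a URI (RFC 2397 form) from raw bytes using only the base class library. `ImageDataUri` is a hypothetical helper, not an SDK type, and these notes do not state whether the Url property is typed as a string or a Uri, so the conversion at the call site is left as an assumption.

```csharp
using System;

// Hypothetical helper (not part of Azure.AI.VoiceLive): encodes raw image
// bytes as a base64 data: URI.
public static class ImageDataUri
{
    public static string FromBytes(byte[] imageBytes, string mediaType)
    {
        if (imageBytes is null) throw new ArgumentNullException(nameof(imageBytes));
        if (string.IsNullOrWhiteSpace(mediaType))
            throw new ArgumentException("A media type such as image/png is required.", nameof(mediaType));

        return $"data:{mediaType};base64,{Convert.ToBase64String(imageBytes)}";
    }
}

public static class Program
{
    public static void Main()
    {
        byte[] sample = { 1, 2, 3 }; // placeholder bytes, not a real image
        Console.WriteLine(ImageDataUri.FromBytes(sample, "image/png"));
        // data:image/png;base64,AQID
    }
}
```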

Enhanced Token Tracking: Added image token usage metrics
- Added ImageTokens property to InputTokenDetails for tracking image processing costs
- Added ImageTokens property to CachedTokenDetails for cached image token tracking

New Voice Options: Added new OpenAI voice presets
- OAIVoice.Cedar - Additional voice option
- OAIVoice.Marin - Additional voice option

Foundry Agent Support: Added support for agent-centric sessions using Azure AI Foundry agents
- Added AgentSessionConfig class for configuring Foundry agent sessions:
  - AgentName - The name of the Foundry agent to use
  - ProjectName - The name of the Azure AI project which the agent belongs to
  - AgentVersion - Optional version of the agent to use
  - ConversationId - Optional conversation ID to continue
  - AuthenticationIdentityClientId - Optional client ID for user-assigned managed identity
  - FoundryResourceOverride - Optional Foundry resource name for cross-resource agent mode
- Added SessionTarget class for specifying session targets (model or agent):
  - SessionTarget.FromModel(string model) - Creates a model-centric session target
  - SessionTarget.FromAgent(AgentSessionConfig agentConfig) - Creates an agent-centric session target
  - Implicit conversions from string and AgentSessionConfig
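
The factory methods and implicit conversions listed above follow a common C# pattern: a method that expects a target can be called with a plain string or a configuration object. The sketch below illustrates that pattern with stand-in types. `DemoSessionTarget` and `DemoAgentConfig` are hypothetical and only mimic the shape described in these notes; the real SessionTarget and AgentSessionConfig may differ in constructors, members, and validation.

```csharp
using System;

// Stand-in types for illustration only; not the Azure.AI.VoiceLive classes.
public sealed class DemoAgentConfig
{
    public DemoAgentConfig(string agentName, string projectName)
    {
        AgentName = agentName;
        ProjectName = projectName;
    }

    public string AgentName { get; }
    public string ProjectName { get; }
}

public sealed class DemoSessionTarget
{
    private DemoSessionTarget(string model, DemoAgentConfig agent)
    {
        Model = model;
        Agent = agent;
    }

    public string Model { get; }           // set for model-centric targets
    public DemoAgentConfig Agent { get; }  // set for agent-centric targets
    public bool IsAgent => Agent != null;

    public static DemoSessionTarget FromModel(string model) =>
        new DemoSessionTarget(model ?? throw new ArgumentNullException(nameof(model)), null);

    public static DemoSessionTarget FromAgent(DemoAgentConfig agent) =>
        new DemoSessionTarget(null, agent ?? throw new ArgumentNullException(nameof(agent)));

    // Implicit conversions let callers pass a string or a config directly.
    public static implicit operator DemoSessionTarget(string model) => FromModel(model);
    public static implicit operator DemoSessionTarget(DemoAgentConfig agent) => FromAgent(agent);
}

public static class Program
{
    public static string Describe(DemoSessionTarget target) =>
        target.IsAgent ? $"agent:{target.Agent.AgentName}" : $"model:{target.Model}";

    public static void Main()
    {
        Console.WriteLine(Describe("example-model"));                              // model:example-model
        Console.WriteLine(Describe(new DemoAgentConfig("helper", "my-project")));  // agent:helper
    }
}
```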

Interim Response Configuration: Added support for interim responses during latency or tool execution delays
- Added InterimResponseConfigBase abstract base class with Triggers and LatencyThresholdMs properties
- Added InterimResponseTrigger extensible enum (Latency, Tool)
- Added LlmInterimResponseConfig for AI-generated interim responses with Model, Instructions, and MaxCompletionTokens properties
- Added StaticInterimResponseConfig for predefined static text interim responses with Texts collection

New VoiceLiveClient Session Methods: Added new overloads for creating and starting sessions
- CreateSession(string model) - Creates an unconnected session for a model
- CreateSession(SessionTarget target) - Creates an unconnected session from a model or agent target
- CreateSession(VoiceLiveSessionOptions sessionConfig) - Creates an unconnected session from session configuration
- StartSessionAsync(AgentSessionConfig agentConfig, ...) - Starts a connected session with a Foundry agent
- StartSessionAsync(SessionTarget target, ...) - Starts a connected session from a model or agent target
- StartSessionAsync(SessionTarget target, VoiceLiveSessionOptions sessionConfig, ...) - Starts a connected session from a target with additional configuration

New Service Version: Added ServiceVersion.V2026_01_01_PREVIEW to VoiceLiveClientOptions
