Azure/azure-sdk-for-java com.azure+azure-ai-voicelive

1.0.0 (2026-06-01)

This is the first General Availability (GA) release of the Azure VoiceLive client library for Java.

Breaking Changes

Narrowed VoiceLiveAsyncClient session startup to three overloads:
- startSession()
- startSession(String, VoiceLiveRequestOptions)
- startSession(AgentSessionConfig, VoiceLiveRequestOptions)
Renamed token-count accessors on token statistic models (JSON wire format unchanged):
- CachedTokenDetails.getTextTokens() / getAudioTokens() / getImageTokens() → getTextTokenCount() / getAudioTokenCount() / getImageTokenCount()
- InputTokenDetails.getCachedTokens() / getTextTokens() / getAudioTokens() / getImageTokens() → getCachedTokenCount() / getTextTokenCount() / getAudioTokenCount() / getImageTokenCount()
- OutputTokenDetails.getTextTokens() / getAudioTokens() / getReasoningTokens() → getTextTokenCount() / getAudioTokenCount() / getReasoningTokenCount()
- ResponseTokenStatistics.getTotalTokens() / getInputTokens() / getOutputTokens() → getTotalTokenCount() / getInputTokenCount() / getOutputTokenCount()
RequestImageContentPart URL accessor renamed and JSON field changed:
- getUrl() / setUrl(String) → getImageUrl() / setImageUrl(String)
- JSON property url → image_url
Renamed base event types for client↔server symmetry:
- ClientEvent (base for outbound events) → SessionClientEvent
- SessionUpdate (base for inbound events) → SessionServerEvent
- VoiceLiveSessionAsyncClient.receiveEvents() now returns Flux<SessionServerEvent>
- VoiceLiveSessionAsyncClient.sendEvent(...) now accepts SessionClientEvent
Renamed MCP-related model types to Pascal case (MCP* → Mcp*): McpApprovalType, McpServer, McpTool, McpApprovalResponseRequestItem, ResponseMcpApprovalRequestItem, ResponseMcpApprovalResponseItem, ResponseMcpCallItem, ResponseMcpListToolItem.
VoiceLiveSessionAsyncClient.truncateConversation(String, int, int) now takes a java.time.Duration as its third argument (the audio-end position) instead of an int of raw milliseconds. The two-argument overload (itemId, contentIndex) is preserved and defaults to Duration.ZERO.
Removed sendInputAudio(byte[]); use sendInputAudio(BinaryData) (wrap raw bytes with BinaryData.fromBytes(...)).
AgentSessionConfig.toQueryParameters() is no longer part of the public API; the conversion is handled internally by VoiceLiveAsyncClient.
VoiceLiveSessionOptions.setAnimation(...) renamed to setAnimationOptions(...).
AnimationOptions.setOutputs(...) / getOutputs() renamed to setOutputTypes(...) / getOutputTypes().
LogProbProperties.getLogprob() renamed to getLogProb().
SessionUpdateConversationItemInputAudioTranscriptionCompleted.getLogprobs() renamed to getLogProbs().
Removed preview service versions from VoiceLiveServiceVersion; only GA versions remain (V2025_10_01, V2026_04_10). The latest version is now V2026_04_10.
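For callers migrating truncateConversation from raw milliseconds to java.time.Duration, the conversion itself needs only the JDK. The following is a minimal sketch, not SDK code: the class TruncateMigration and the helper audioEnd are hypothetical names introduced for illustration, and the negative-value check is an assumption rather than documented SDK behavior.

```java
import java.time.Duration;

// Hypothetical migration helper; not part of azure-ai-voicelive.
final class TruncateMigration {

    // Converts a legacy millisecond offset into the Duration that the
    // audio-end argument now expects. Rejecting negative offsets is an
    // assumption made for this sketch.
    static Duration audioEnd(int audioEndMs) {
        if (audioEndMs < 0) {
            throw new IllegalArgumentException("audioEndMs must be >= 0");
        }
        return Duration.ofMillis(audioEndMs);
    }

    public static void main(String[] args) {
        Duration end = audioEnd(1500);
        System.out.println(end.toMillis()); // prints 1500
    }
}
```

The resulting Duration would then be passed wherever the int millisecond value was passed before. An offset of 0 yields a value equal to Duration.ZERO, which matches the default used by the two-argument overload.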

Features Added

Avatar voice synchronization for video avatars:
- New AzureVoiceType.AVATAR_VOICE_SYNC and AzureAvatarVoiceSyncVoice class
- New server events ServerEventSessionAvatarSwitchToSpeaking / ServerEventSessionAvatarSwitchToIdle
- New ServerEventResponseVideoDelta for streaming avatar video frames
- New ClientEventOutputAudioBufferClear (output_audio_buffer.clear) and ServerEventOutputAudioBufferCleared (output_audio_buffer.cleared) for clearing the avatar output audio buffer
Web search and file search tool calls:
- New ItemType.WEB_SEARCH_CALL, ItemType.FILE_SEARCH_CALL
- New ResponseWebSearchCallItem (with ResponseWebSearchCallItemStatus) and ResponseFileSearchCallItem (with ResponseFileSearchCallItemStatus, plus FileSearchResult results)
- New lifecycle server events: ServerEventResponseWebSearchCall{Searching,InProgress,Completed} and ServerEventResponseFileSearchCall{Searching,InProgress,Completed}
Transcription enhancements:
- New transcription models on AudioInputTranscriptionOptionsModel: GPT_4O_TRANSCRIBE_DIARIZE, MAI_TRANSCRIBE_1
- New TranscriptionPhrase and TranscriptionWord types with timing/confidence information
- SessionUpdateConversationItemInputAudioTranscriptionCompleted now exposes getLogProbs() and getPhrases()
- New ServerEventResponseAudioTranscriptAnnotationAdded event
Session include options and metadata:
- New SessionIncludeOption expandable enum for opting into additional response payloads (e.g. logprobs, phrases, file-search results)
- VoiceLiveSessionOptions and VoiceLiveSessionResponse now expose include (List<SessionIncludeOption>) and metadata (Map<String,String>, up to 16 entries)
Personal voice models: added PersonalVoiceModels.DRAGON_HDOMNI_LATEST_NEURAL and MAI_VOICE_1
Reasoning token usage: OutputTokenDetails.getReasoningTokenCount() exposes reasoning token counts
Interim response on response.create: ResponseCreateParams.setInterimResponse(BinaryData) lets callers attach interim response config to a single response request
Restored no-arg VoiceLiveAsyncClient.startSession() overload (uses the deployment's default model).
Significantly improved Javadoc for ServerVadTurnDetection, AzureCustomVoice, AzurePersonalVoice, AzureStandardVoice, AzureSemanticVadTurnDetection*, and other model types
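The session metadata map noted above is limited to 16 entries. These notes do not say whether the client library enforces that limit before sending, so an application may want to check it up front. The sketch below uses only the JDK; SessionMetadataCheck and requireWithinLimit are hypothetical names, and the choice to throw IllegalArgumentException is an assumption of this example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical pre-flight check; not part of azure-ai-voicelive.
final class SessionMetadataCheck {

    // Entry limit stated in the release notes for session metadata.
    static final int MAX_ENTRIES = 16;

    // Returns an unmodifiable copy of the map if it is within the limit,
    // otherwise throws. Insertion order of the input is preserved.
    static Map<String, String> requireWithinLimit(Map<String, String> metadata) {
        if (metadata.size() > MAX_ENTRIES) {
            throw new IllegalArgumentException(
                "metadata has " + metadata.size() + " entries; limit is " + MAX_ENTRIES);
        }
        return Collections.unmodifiableMap(new LinkedHashMap<>(metadata));
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("tenant", "contoso");
        metadata.put("scenario", "support-call");
        System.out.println(requireWithinLimit(metadata).size()); // prints 2
    }
}
```

The validated map would then be handed to the metadata property on VoiceLiveSessionOptions; the exact setter name is not given in these notes, so it is left out here.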

Other Changes

Updated default service API version to 2026-04-10 (GA).
