1.0.0 (2026-06-01)
This is the first General Availability (GA) release of the Azure VoiceLive client library for Java.
Breaking Changes
- Narrowed VoiceLiveAsyncClient session startup to three overloads:
  - startSession()
  - startSession(String, VoiceLiveRequestOptions)
  - startSession(AgentSessionConfig, VoiceLiveRequestOptions)
- Renamed token-count accessors on token statistic models (JSON wire format unchanged):
  - CachedTokenDetails.getTextTokens()/getAudioTokens()/getImageTokens() → getTextTokenCount()/getAudioTokenCount()/getImageTokenCount()
  - InputTokenDetails.getCachedTokens()/getTextTokens()/getAudioTokens()/getImageTokens() → getCachedTokenCount()/getTextTokenCount()/getAudioTokenCount()/getImageTokenCount()
  - OutputTokenDetails.getTextTokens()/getAudioTokens()/getReasoningTokens() → getTextTokenCount()/getAudioTokenCount()/getReasoningTokenCount()
  - ResponseTokenStatistics.getTotalTokens()/getInputTokens()/getOutputTokens() → getTotalTokenCount()/getInputTokenCount()/getOutputTokenCount()
- RequestImageContentPartURL accessor renamed and JSON field changed:
  - getUrl()/setUrl(String) → getImageUrl()/setImageUrl(String)
  - JSON property url → image_url
- Renamed base event types for client↔server symmetry:
  - ClientEvent (base for outbound events) → SessionClientEvent
  - SessionUpdate (base for inbound events) → SessionServerEvent
  - VoiceLiveSessionAsyncClient.receiveEvents() now returns Flux<SessionServerEvent>
  - VoiceLiveSessionAsyncClient.sendEvent(...) now accepts SessionClientEvent
- Renamed MCP-related model types to Pascal case (MCP* → Mcp*): McpApprovalType, McpServer, McpTool, McpApprovalResponseRequestItem, ResponseMcpApprovalRequestItem, ResponseMcpApprovalResponseItem, ResponseMcpCallItem, ResponseMcpListToolItem.
- VoiceLiveSessionAsyncClient.truncateConversation(String, int, int) now accepts a java.time.Duration for the audio-end position instead of raw milliseconds. The two-argument overload (itemId, contentIndex) is preserved and defaults to Duration.ZERO.
- Removed sendInputAudio(byte[]); use sendInputAudio(BinaryData) (wrap raw bytes with BinaryData.fromBytes(...)).
- AgentSessionConfig.toQueryParameters() is no longer part of the public API; the conversion is handled internally by VoiceLiveAsyncClient.
- VoiceLiveSessionOptions.setAnimation(...) renamed to setAnimationOptions(...).
- AnimationOptions.setOutputs(...)/getOutputs() renamed to setOutputTypes(...)/getOutputTypes().
- LogProbProperties.getLogprob() renamed to getLogProb().
- SessionUpdateConversationItemInputAudioTranscriptionCompleted.getLogprobs() renamed to getLogProbs().
- Removed preview service versions from VoiceLiveServiceVersion; only GA versions remain (V2025_10_01, V2026_04_10). The latest version is now V2026_04_10.
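
For callers that still track the audio-end position as an integer millisecond offset, the truncateConversation change above amounts to a one-line conversion. The following is a minimal sketch using only java.time.Duration; the class TruncateMigrationSketch and the method toAudioEnd are hypothetical names introduced for illustration and are not part of the library. Rejecting negative offsets is a defensive choice of this sketch, not documented library behavior.

```java
import java.time.Duration;

/** Hypothetical migration helper; not part of the VoiceLive client library. */
final class TruncateMigrationSketch {
    private TruncateMigrationSketch() {
    }

    /** Converts a legacy audio-end offset in milliseconds to a Duration. */
    static Duration toAudioEnd(int audioEndMillis) {
        // Assumption of this sketch: a negative offset is treated as a caller error.
        if (audioEndMillis < 0) {
            throw new IllegalArgumentException("audioEndMillis must be non-negative");
        }
        return Duration.ofMillis(audioEndMillis);
    }

    public static void main(String[] args) {
        Duration audioEnd = toAudioEnd(1500);
        // Call shape before 1.0.0, per the entry above:
        //   session.truncateConversation(itemId, contentIndex, 1500);
        // Assumed call shape from 1.0.0 on, with the Duration in the last position:
        //   session.truncateConversation(itemId, contentIndex, audioEnd);
        System.out.println(audioEnd); // prints PT1.5S
    }
}
```

Code that already holds a Duration can pass it directly and needs no helper.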
Features Added
- Avatar voice synchronization for video avatars:
  - New AzureVoiceType.AVATAR_VOICE_SYNC and AzureAvatarVoiceSyncVoice class
  - New server events ServerEventSessionAvatarSwitchToSpeaking / ServerEventSessionAvatarSwitchToIdle
  - New ServerEventResponseVideoDelta for streaming avatar video frames
  - New ClientEventOutputAudioBufferClear (output_audio_buffer.clear) and ServerEventOutputAudioBufferCleared (output_audio_buffer.cleared) for clearing the avatar output audio buffer
- Web search and file search tool calls:
  - New ItemType.WEB_SEARCH_CALL, ItemType.FILE_SEARCH_CALL
  - New ResponseWebSearchCallItem (with ResponseWebSearchCallItemStatus) and ResponseFileSearchCallItem (with ResponseFileSearchCallItemStatus, plus FileSearchResult results)
  - New lifecycle server events: ServerEventResponseWebSearchCall{Searching,InProgress,Completed} and ServerEventResponseFileSearchCall{Searching,InProgress,Completed}
- Transcription enhancements:
  - New transcription models on AudioInputTranscriptionOptionsModel: GPT_4O_TRANSCRIBE_DIARIZE, MAI_TRANSCRIBE_1
  - New TranscriptionPhrase and TranscriptionWord types with timing/confidence information
  - SessionUpdateConversationItemInputAudioTranscriptionCompleted now exposes getLogProbs() and getPhrases()
  - New ServerEventResponseAudioTranscriptAnnotationAdded event
- Session include options and metadata:
  - New SessionIncludeOption expandable enum for opting into additional response payloads (e.g. logprobs, phrases, file-search results)
  - VoiceLiveSessionOptions and VoiceLiveSessionResponse now expose include (List<SessionIncludeOption>) and metadata (Map<String,String>, up to 16 entries)
- Personal voice models: added PersonalVoiceModels.DRAGON_HDOMNI_LATEST_NEURAL and MAI_VOICE_1
- Reasoning token usage: OutputTokenDetails.getReasoningTokenCount() exposes reasoning token counts
- Interim response on response.create: ResponseCreateParams.setInterimResponse(BinaryData) lets callers attach interim response config to a single response request
- Restored no-arg VoiceLiveAsyncClient.startSession() overload (uses the deployment's default model).
- Significantly improved Javadoc for ServerVadTurnDetection, AzureCustomVoice, AzurePersonalVoice, AzureStandardVoice, AzureSemanticVadTurnDetection*, and other model types
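
The 16-entry limit on session metadata noted above can be checked on the caller's side before a session is configured. The sketch below uses only java.util; SessionMetadataSketch, MAX_ENTRIES, and checked are hypothetical names introduced for illustration. Whether the library or the service enforces the limit itself, and with what error, is not stated in these notes, so this pre-check is an assumption about how a caller might guard against it.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical caller-side guard; not part of the VoiceLive client library. */
final class SessionMetadataSketch {
    /** Entry limit taken from the release notes ("up to 16 entries"). */
    static final int MAX_ENTRIES = 16;

    private SessionMetadataSketch() {
    }

    /** Returns an unmodifiable copy of the map, or throws if it exceeds the limit. */
    static Map<String, String> checked(Map<String, String> metadata) {
        if (metadata.size() > MAX_ENTRIES) {
            throw new IllegalArgumentException(
                "metadata has " + metadata.size() + " entries; at most " + MAX_ENTRIES + " are allowed");
        }
        // Copy first so later changes to the caller's map do not leak through.
        return Collections.unmodifiableMap(new LinkedHashMap<>(metadata));
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("scenario", "demo");
        metadata.put("tenant", "contoso");
        // The checked map would then be supplied as the metadata value on VoiceLiveSessionOptions.
        System.out.println(checked(metadata).size()); // prints 2
    }
}
```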
Other Changes
- Updated default service API version to 2026-04-10 (GA).