1.0.0-beta.3 (2025-12-03)
Features Added
- Added image input support for multimodal conversations:
  - RequestImageContentPart for including images in conversation messages with URL references
  - RequestImageContentPartDetail enum for controlling image detail level (auto, low, high)
  - ContentPartType.INPUT_IMAGE discriminator for image content parts
- Added avatar configuration enhancements:
  - AvatarConfiguration class for configuring avatar streaming and behavior with ICE servers, character selection, style, and video parameters
  - AvatarConfigTypes enum for video and photo avatar types
  - AvatarOutputProtocol enum supporting WebRTC and WebSocket protocols
  - PhotoAvatarBaseModes enum with VASA-1 model support
- Added token usage tracking improvements:
  - CachedTokenDetails for tracking cached text, audio, and image tokens
  - Enhanced InputTokenDetails with image token tracking and cached token details
- Added MCP call lifecycle events:
  - ServerEventResponseMcpCallInProgress for tracking ongoing MCP calls
  - ServerEventResponseMcpCallCompleted for successful MCP call completion
  - ServerEventResponseMcpCallFailed for failed MCP calls
- Added two new OpenAI voices: OpenAIVoiceName.MARIN and OpenAIVoiceName.CEDAR
- Enhanced AzurePersonalVoice with additional customization options:
  - Custom lexicon URL support for pronunciation customization
  - Locale preferences with preferLocales for multilingual scenarios
  - Voice style, pitch, rate, and volume controls for fine-tuned voice characteristics