github Azure/azure-sdk-for-python azure-ai-voicelive_1.0.0b5

pre-release, 5 hours ago

1.0.0b5 (2025-09-26)

Features Added

  • Enhanced Semantic Detection Type Safety: Added new EouThresholdLevel enum for better type safety in end-of-utterance detection:
    • LOW for low sensitivity threshold level
    • MEDIUM for medium sensitivity threshold level
    • HIGH for high sensitivity threshold level
    • DEFAULT for default sensitivity threshold level
  • Improved Semantic Detection Configuration: Enhanced semantic detection classes with better type annotations:
    • threshold_level parameter now supports both string values and EouThresholdLevel enum
    • Cleaner type definitions for AzureSemanticDetection, AzureSemanticDetectionEn, and AzureSemanticDetectionMultilingual
    • Improved documentation for threshold level parameters
  • Comprehensive Unit Test Suite: Added extensive unit test coverage with 200+ test cases covering:
    • All enum types and their functionality
    • Model creation, validation, and serialization
    • Async connection functionality with proper mocking
    • Client event handling and workflows
    • Voice configuration across all supported types
    • Message handling with content part hierarchy
    • Integration scenarios and real-world usage patterns
    • Recent changes validation and backwards compatibility
  • API Version Update: Updated to API version 2025-10-01 (from 2025-05-01-preview)
  • Enhanced Type Safety: Added new AzureVoiceType enum to categorize Azure voice types:
    • AZURE_CUSTOM for custom voice configurations
    • AZURE_STANDARD for standard voice configurations
    • AZURE_PERSONAL for personal voice configurations
  • Improved Message Handling: Added MessageRole enum for better role type safety in message items
  • Enhanced Model Documentation: Comprehensive documentation improvements across all models:
    • Added detailed docstrings for model classes and their parameters
    • Enhanced enum value documentation with descriptions
    • Improved type annotations and parameter descriptions
  • Enhanced Semantic Detection: Added improved configuration options for all semantic detection classes:
    • Added threshold_level parameter with options: "low", "medium", "high", "default" (recommended over deprecated threshold)
    • Added timeout_ms parameter for timeout configuration in milliseconds (recommended over deprecated timeout)
  • Video Background Support: Added new Background model for video background customization:
    • Support for solid color backgrounds in hex format (e.g., #00FF00FF)
    • Support for image URL backgrounds
    • Color and image URL options are mutually exclusive (set one or the other)
  • Enhanced Video Parameters: Extended VideoParams model with:
    • background parameter for configuring video backgrounds using the new Background model
    • gop_size parameter for Group of Pictures (GOP) size control, affecting compression efficiency and seeking performance
  • Improved Type Safety: Added TurnDetectionType enum for better type safety and IntelliSense support
  • Package Structure Modernization: Simplified package initialization with namespace package support
  • Enhanced Error Handling: Added ConnectionError and ConnectionClosed exception classes to the async API for better WebSocket error management
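
The threshold_level entries above say the parameter accepts either a plain string or an EouThresholdLevel member. A minimal sketch of how a string-backed enum makes that possible is shown below. The EouThresholdLevel class here is a local stand-in built from the names and string values listed in these notes, not the SDK's own class, and normalize_threshold_level is a hypothetical helper introduced only for illustration.

```python
from enum import Enum
from typing import Union


class EouThresholdLevel(str, Enum):
    """Local stand-in mirroring the enum named in the release notes (not the SDK class)."""

    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    DEFAULT = "default"


def normalize_threshold_level(value: Union[str, EouThresholdLevel]) -> EouThresholdLevel:
    """Hypothetical helper: accept either form and return the enum member."""
    if isinstance(value, EouThresholdLevel):
        return value
    # Enum lookup by value raises ValueError for anything outside the four levels.
    return EouThresholdLevel(value.lower())
```

Because the stand-in subclasses str, a member compares equal to its string value, so code that already passes "low" or "high" keeps working while new code can use the enum and get editor completion.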

Breaking Changes

  • Cross-Language Package Identity Update: Updated package ID from VoiceLive to VoiceLive.WebSocket for better cross-language consistency
  • Model Refactoring:
    • Renamed UserContentPart to MessageContentPart for clearer content part hierarchy
    • All message items now require a content field with a list of MessageContentPart objects
    • OutputTextContentPart now inherits from MessageContentPart instead of being standalone
  • Enhanced Type Safety:
    • Azure voice classes now use AzureVoiceType enum discriminators instead of string literals
    • Message role discriminators now use MessageRole enum values for better type safety
  • Removed Deprecated Parameters: Completely removed deprecated parameters from semantic detection classes:
    • Removed threshold parameter from all semantic detection classes (AzureSemanticDetection, AzureSemanticDetectionEn, AzureSemanticDetectionMultilingual)
    • Removed timeout parameter from all semantic detection classes
    • Users must now use the threshold_level and timeout_ms parameters, respectively
  • Removed Synchronous API: Completely removed synchronous WebSocket operations to focus exclusively on async patterns:
    • Removed sync connect() function and sync VoiceLiveConnection class from main patch implementation
    • Removed sync basic_voice_assistant.py sample (only async version remains)
    • Simplified sync patch to minimal structure with empty exports
    • All functionality now available only through async patterns
  • Updated Dependencies: Modified package dependencies to reflect async-only architecture:
    • Moved aiohttp>=3.9.0,<4.0.0 from optional to required dependency
    • Removed the websockets optional dependency because the sync API no longer exists
    • Removed optional dependency groups websockets, aiohttp, and all-websockets
  • Model Rename:
    • Renamed AudioInputTranscriptionSettings to AudioInputTranscriptionOptions for consistency with naming conventions
    • Renamed AzureMultilingualSemanticVad to AzureSemanticVadMultilingual for naming consistency with other multilingual variants
  • Enhanced Type Safety: Turn detection discriminator types now use enum values instead of string literals for better type safety
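
The renames and removed parameters above are mechanical enough to check for automatically. The sketch below scans a string of Python source for the old names listed in this section and reports what to change. find_migration_issues is a hypothetical helper, not part of the SDK or its MIGRATION_GUIDE.md. It only inspects single-line constructor calls and makes no attempt to translate old threshold or timeout values, since these notes do not define a mapping.

```python
import re

# Old name -> new name, taken from the rename entries in these release notes.
RENAMED_MODELS = {
    "UserContentPart": "MessageContentPart",
    "AudioInputTranscriptionSettings": "AudioInputTranscriptionOptions",
    "AzureMultilingualSemanticVad": "AzureSemanticVadMultilingual",
}

# Removed keyword -> replacement keyword on the semantic detection classes.
REMOVED_KEYWORDS = {
    "threshold": "threshold_level",
    "timeout": "timeout_ms",
}

SEMANTIC_DETECTION_CLASSES = (
    "AzureSemanticDetection",
    "AzureSemanticDetectionEn",
    "AzureSemanticDetectionMultilingual",
)


def find_migration_issues(source: str) -> list[str]:
    """Hypothetical helper: list hints for code written before 1.0.0b5."""
    issues = []
    for old, new in RENAMED_MODELS.items():
        if re.search(rf"\b{old}\b", source):
            issues.append(f"{old} was renamed to {new}")
    # Match single-line constructor calls of the semantic detection classes.
    alternatives = "|".join(SEMANTIC_DETECTION_CLASSES)
    call = re.compile(r"\b(" + alternatives + r")\(([^)]*)\)")
    for match in call.finditer(source):
        cls, args = match.groups()
        for old, new in REMOVED_KEYWORDS.items():
            # "threshold_level=" does not match because "_" follows the keyword.
            if re.search(rf"\b{old}\s*=", args):
                issues.append(f"{cls}: {old}= was removed, use {new}=")
    return issues
```

Running the helper over each file of an application before upgrading gives a quick checklist. Multi-line calls and aliased imports would need a real parser such as the standard ast module.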

Bug Fixes

  • Serialization Improvements: Fixed type casting issue in serialization utilities for better enum handling and type safety

Other Changes

  • Testing Infrastructure: Added comprehensive unit test suite with extensive coverage:
    • 8 main test files with 200+ individual test methods
    • Tests for all enums, models, async operations, client events, voice configurations, and message handling
    • Integration tests covering real-world scenarios and recent changes
    • Proper mocking for async WebSocket connections
    • Backwards compatibility validation
    • Test coverage for all recent changes and enhancements
  • API Documentation: Updated API view properties to reflect model structure changes, new enums, and cross-language package identity
  • Documentation Updates: Comprehensive updates to all markdown documentation:
    • Updated README.md to reflect the async-only API, with updated examples and installation instructions
    • Updated samples README.md to remove sync sample references
    • Enhanced BASIC_VOICE_ASSISTANT.md with comprehensive async implementation guide
    • Added MIGRATION_GUIDE.md for users upgrading from previous versions
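
The testing entries above mention mocking async WebSocket connections. One common way to do that with only the standard library is unittest.mock.AsyncMock, sketched below. The connection object, its send and recv methods, and the event dictionaries are placeholders chosen for illustration. They are assumptions, not the package's actual test code or event names.

```python
import asyncio
from unittest.mock import AsyncMock


async def send_and_wait(connection, event: dict) -> dict:
    """Hypothetical code under test: send one event, return the next reply."""
    await connection.send(event)
    return await connection.recv()


def run_mocked_exchange() -> dict:
    # AsyncMock attributes are awaitable mocks, so no real socket is opened.
    connection = AsyncMock()
    connection.recv.return_value = {"type": "example.reply"}
    reply = asyncio.run(send_and_wait(connection, {"type": "example.request"}))
    # Verify the code under test awaited send() exactly once with the event.
    connection.send.assert_awaited_once_with({"type": "example.request"})
    return reply
```

The same pattern extends to error paths by setting side_effect on the mocked method to an exception instance.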
