MLX Omni Server v0.1.1
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
🚀 Key Features
OpenAI Compatible API Endpoints
- /v1/chat/completions - chat, tools/function calling, and logprobs
- /v1/audio/speech - Text-to-Speech
- /v1/audio/transcriptions - Speech-to-Text
- /v1/models - model listing and management
- /v1/images/generations - image generation
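As a sketch of what "OpenAI-compatible" means in practice, the chat endpoint above can be called with nothing but the Python standard library. The port (10240) and the model id below are assumptions for illustration, not values documented here; substitute whatever your server actually reports on startup.

```python
# Minimal sketch: calling the local /v1/chat/completions endpoint over
# plain HTTP. Assumes the server listens on localhost:10240 (hypothetical
# default) and that an MLX model id is available locally.
import json
import urllib.request

BASE_URL = "http://localhost:10240"  # assumed local address


def build_chat_request(prompt: str) -> urllib.request.Request:
    """Construct an OpenAI-style chat completion request for the local server."""
    payload = {
        "model": "mlx-community/Llama-3.2-1B-Instruct-4bit",  # example model id
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    try:
        with urllib.request.urlopen(build_chat_request("Say hello.")) as resp:
            body = json.load(resp)
            print(body["choices"][0]["message"]["content"])
    except OSError as exc:  # server not running locally
        print(f"Could not reach server: {exc}")
```

Because the request and response bodies follow the OpenAI wire format, the same payload shape works for tools/function calling and logprobs by adding the corresponding fields.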
Core Capabilities
- Optimized for Apple Silicon (M1/M2/M3 series) chips
- Full local inference for privacy
- Multiple AI capabilities:
- Audio Processing (TTS & STT)
- Chat Completion
- Image Generation
- High Performance with hardware-accelerated local inference
- Privacy-First: All processing happens locally on your machine
- SDK Support: Works with official OpenAI SDK and other compatible clients