What's New
Text Embeddings Support
- New /v1/embeddings API endpoint - Full OpenAI compatibility
- Built-in embedding generation for semantic search and RAG applications
- Batch processing with automatic padding and optimization
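To illustrate what padding a batch involves, here is a minimal Python sketch. It is not Swama's implementation: the function name `pad_batch`, the `pad_id` default, and the right-padding choice are all assumptions made for this example. Inputs of different token lengths are padded to the longest one, and a mask records which positions hold real tokens.

```python
from typing import List, Tuple


def pad_batch(
    token_ids: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every sequence to the longest length in the batch.

    Returns (padded_ids, attention_mask). The mask holds 1 for real
    tokens and 0 for padding. pad_id=0 is a hypothetical default.
    """
    if not token_ids:
        return [], []
    max_len = max(len(seq) for seq in token_ids)
    padded, mask = [], []
    for seq in token_ids:
        gap = max_len - len(seq)
        padded.append(seq + [pad_id] * gap)
        mask.append([1] * len(seq) + [0] * gap)
    return padded, mask
```

The mask lets a pooling step ignore padded positions, so a short input in a batch is not skewed by the padding added to match a longer one.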
Enhanced Chat Capabilities
- System prompts support - Define AI assistant behavior with system messages
- Multi-turn conversations - Maintain conversation history across requests
- Full OpenAI ChatGPT API compatibility - Drop-in replacement for chat applications
Intelligent Memory Management
- Automatic model eviction - Prevents GPU memory exhaustion with smart cleanup
- Usage-based prioritization - Keeps frequently used models in memory
- Concurrent inference control - Safe parallel processing with per-model locking
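The memory management ideas above can be sketched in a few lines of Python. This is a toy model, not Swama's code: the class `ModelCache`, its byte budget, and its method names are hypothetical. It also uses least-recently-used ordering as a simple stand-in for usage-based prioritization, and it does not guard against evicting a model that is mid-inference.

```python
import threading
from collections import OrderedDict
from typing import Callable, List


class ModelCache:
    """Toy cache: evicts the least recently used model when over budget."""

    def __init__(self, budget_bytes: int) -> None:
        self.budget_bytes = budget_bytes
        self._models = OrderedDict()  # model name -> size in bytes
        self._locks = {}              # model name -> threading.Lock
        self._guard = threading.Lock()  # protects the two dicts above

    def acquire(self, name: str, size_bytes: int) -> List[str]:
        """Mark `name` as used, loading it if needed; return evicted names."""
        evicted: List[str] = []
        with self._guard:
            if name in self._models:
                self._models.move_to_end(name)  # most recently used goes last
                return evicted
            # Evict from the least recently used end until the new model fits.
            while (
                self._models
                and sum(self._models.values()) + size_bytes > self.budget_bytes
            ):
                victim, _ = self._models.popitem(last=False)
                self._locks.pop(victim, None)
                evicted.append(victim)
            self._models[name] = size_bytes
            self._locks[name] = threading.Lock()
        return evicted

    def run(self, name: str, fn: Callable[[], object]) -> object:
        """Run `fn` under the model's lock so calls on one model are serialized."""
        with self._guard:
            lock = self._locks[name]
        with lock:
            return fn()

    def loaded(self) -> List[str]:
        with self._guard:
            return list(self._models)
```

Because each model has its own lock, requests for different models can proceed in parallel while requests for the same model wait their turn.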
Usage
Generate Embeddings
curl -X POST http://localhost:28100/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": ["Hello world", "Text embeddings"],
    "model": "mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ"
  }'
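For semantic search, the same request can be made from Python and the returned vectors ranked by cosine similarity. The sketch below uses only the standard library. It assumes the response follows the OpenAI embeddings shape (a `data` list whose items carry `index` and `embedding`), which these notes imply but do not show. The helper names `embed`, `parse_embeddings`, `cosine`, and `rank` are made up for this example.

```python
import json
import math
import urllib.request
from typing import List

BASE_URL = "http://localhost:28100"  # host and port taken from the curl example


def parse_embeddings(payload: dict) -> List[List[float]]:
    # Assumed OpenAI-style shape: {"data": [{"index": i, "embedding": [...]}, ...]}
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts: List[str], model: str) -> List[List[float]]:
    """POST to /v1/embeddings and return one vector per input, in input order."""
    body = json.dumps({"input": texts, "model": model}).encode("utf-8")
    req = urllib.request.Request(
        BASE_URL + "/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return parse_embeddings(json.load(resp))


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # zero vectors compare as unrelated


def rank(query_vec: List[float], doc_vecs: List[List[float]]) -> List[int]:
    """Indices of doc_vecs ordered from most to least similar to query_vec."""
    return sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
```

With a server running, `embed(["Hello world", "Text embeddings"], "mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ")` would send the same request as the curl command. `rank` then orders stored document vectors against a query vector.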
Chat with System Prompts
curl -X POST http://localhost:28100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful math tutor."},
      {"role": "user", "content": "Explain quadratic equations."}
    ],
    "model": "qwen3"
  }'
Multi-turn Conversations
curl -X POST http://localhost:28100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "My name is Alice."},
      {"role": "assistant", "content": "Nice to meet you, Alice!"},
      {"role": "user", "content": "What is my name?"}
    ],
    "model": "qwen3"
  }'
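The multi-turn request above carries the earlier turns in `messages`, so a client needs to keep that list and append to it on every exchange. Here is a small Python sketch of that bookkeeping. The `Conversation` class and its methods are hypothetical, and the reply is read from `choices[0].message.content`, assuming an OpenAI-style chat completion response.

```python
import json
from typing import Dict, List, Optional


class Conversation:
    """Keeps the message history that is re-sent with every chat request."""

    def __init__(self, model: str, system_prompt: Optional[str] = None) -> None:
        self.model = model
        self.messages: List[Dict[str, str]] = []
        if system_prompt:
            # The system message goes first and stays there for the whole chat.
            self.messages.append({"role": "system", "content": system_prompt})

    def request_body(self, user_text: str) -> str:
        """Append the user turn and return the JSON body for /v1/chat/completions."""
        self.messages.append({"role": "user", "content": user_text})
        return json.dumps({"model": self.model, "messages": self.messages})

    def record_reply(self, response: dict) -> str:
        """Store the assistant turn from the response and return its text."""
        reply = response["choices"][0]["message"]["content"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Each call to `request_body` produces a payload like the one passed to `-d` in the curl examples. Calling `record_reply` with the parsed response keeps the history complete for the next turn.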
Download
Upgrade Notes
- If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools.
Requirements
- macOS 14.0+
- Apple Silicon (M1/M2/M3/M4)
What's Changed
- Fix: Correct command examples in README files by @djx-trans-n in #2
- Fix Chinese ReadMe for downloading by @mmRose-MIAO in #4
- FEAT:Support model aliases for api model handler by @mmRose-MIAO in #6
- feat: Add comprehensive AI model performance benchmark script (Ollama vs Swama) by @syh-trans-n in #7
- feat: add comprehensive AI model benchmark script with expanded model support by @syh-trans-n in #9
- Fix support info readme by @Li-Haojie-1106 in #11
- add embedding models support by @sxy-trans-n in #13
- Add Chat Message Support and Memory Management by @sxy-trans-n in #14
New Contributors
- @djx-trans-n made their first contribution in #2
- @mmRose-MIAO made their first contribution in #4
- @syh-trans-n made their first contribution in #7
- @Li-Haojie-1106 made their first contribution in #11
Full Changelog: v1.0.0...v1.1.0