Platform Detection Improvements - Hardware Acceleration
Overview
This release fixes a critical performance issue where Apple Silicon M1/M2/M3 Macs used CPU-only PyTorch instead of Metal Performance Shaders (MPS), resulting in 3-5x slower embedding generation.
What's Fixed
Problem:
- update_and_restart.sh used simple bash-only detection
- All macOS treated as CPU-only (no MPS support)
- No support for ROCm (AMD) or DirectML (Windows)
- Only detected NVIDIA via nvidia-smi on Linux
Solution:
- Created scripts/utils/detect_platform.py using shared gpu_detection.py module
- Enhanced update_and_restart.sh with comprehensive hardware detection
- Optimal PyTorch index selection per platform (MPS, CUDA, ROCm, DirectML, CPU)
- Graceful fallback to old bash logic if helper unavailable
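As an illustration of the per-platform selection described above, here is a minimal sketch using only the standard library. It is not the contents of gpu_detection.py, which these notes do not show: the function names, the probe commands (nvidia-smi, rocminfo) and the exact index URLs are assumptions chosen for the example.

```python
import platform
import shutil

# Assumed mapping from accelerator label to a PyTorch wheel index.
# An empty string stands for "use the default PyPI wheels".
PYTORCH_INDEX = {
    "mps": "",
    "cuda": "https://download.pytorch.org/whl/cu121",
    "rocm": "https://download.pytorch.org/whl/rocm5.6",
    "directml": "",
    "cpu": "https://download.pytorch.org/whl/cpu",
}


def choose_accelerator(system, machine, has_nvidia, has_rocm):
    """Pick an accelerator label from basic facts about the host (hypothetical helper)."""
    if system == "Darwin" and machine == "arm64":
        # Apple Silicon under a native (non-Rosetta) Python.
        return "mps"
    if has_nvidia:
        return "cuda"
    if system == "Linux" and has_rocm:
        return "rocm"
    if system == "Windows":
        # Simplification: a real check would confirm a DirectX 12 capable GPU.
        return "directml"
    return "cpu"


def detect():
    """Probe the current machine and return the label plus the assumed index URL."""
    accelerator = choose_accelerator(
        platform.system(),
        platform.machine(),
        shutil.which("nvidia-smi") is not None,
        shutil.which("rocminfo") is not None,
    )
    return {
        "accelerator": accelerator,
        "pytorch_index_url": PYTORCH_INDEX[accelerator],
    }
```

Keeping the decision in a pure function (choose_accelerator) separate from the probing (detect) makes the selection rules easy to test without access to each kind of hardware.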
Benefits
- ⚡ 3-5x faster embedding generation on Apple Silicon
- 🎯 Comprehensive detection: MPS, CUDA (cu121/cu118/cu102), ROCm (rocm5.6), DirectML
- 🔧 Consistent logic with install.py
- 📚 Well documented with platform comparison guide
Added
- Platform Detection Helper (scripts/utils/detect_platform.py)
  - Python-based detection using shared gpu_detection.py module
  - JSON output for bash consumption
  - Comprehensive platform support: MPS, CUDA, ROCm, DirectML, CPU
- Documentation (scripts/utils/README_detect_platform.md)
  - Platform comparison table (Old vs New)
  - Example outputs for different hardware
  - Integration details with update_and_restart.sh
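One way a helper can stay safe for bash consumption is to print a single line of valid JSON on every run, including when detection itself fails. The sketch below shows that idea under stated assumptions: the render function, the key names and the placeholder detector are hypothetical and are not taken from detect_platform.py.

```python
import json
import sys


def render(detect):
    """Run a detection callable and always return one line of valid JSON."""
    try:
        result = dict(detect())
    except Exception as exc:
        # Any failure degrades to CPU defaults instead of emitting a traceback
        # that the calling shell script would then have to parse.
        result = {"accelerator": "cpu", "pytorch_index_url": "", "error": str(exc)}
    # sort_keys gives a stable key order, which keeps the output diffable.
    return json.dumps(result, sort_keys=True)


if __name__ == "__main__":
    # Placeholder detector so the sketch runs on its own.
    placeholder = lambda: {"accelerator": "mps", "pytorch_index_url": ""}
    sys.stdout.write(render(placeholder) + "\n")
```

With this shape, the shell side only ever has to handle one output format, whether detection succeeded or not.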
Technical Details
Files Changed:
- scripts/update_and_restart.sh - Enhanced platform detection (lines 230-320)
- scripts/utils/detect_platform.py - New Python detection helper
- scripts/utils/README_detect_platform.md - Comprehensive documentation
Code Quality:
- 10 Gemini review suggestions implemented
- Efficient JSON parsing (single Python process)
- Robust error handling with defaults
- Strict validation (exact array length checks)
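The parsing and validation points above can be sketched as follows: decode the helper's JSON once, extract every field in that same process, and accept the result only if it yields exactly the expected number of lines. The field names and defaults here are assumptions for illustration, and the actual checks in update_and_restart.sh may differ.

```python
import json

FIELDS = ("accelerator", "pytorch_index_url")  # hypothetical field names
DEFAULTS = ["cpu", ""]                         # assumed safe fallback values


def fields_for_bash(raw):
    """Parse helper output once and return exactly len(FIELDS) strings."""
    try:
        data = json.loads(raw)
        values = [str(data[name]) for name in FIELDS]
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing key, or a non-object payload.
        return list(DEFAULTS)
    # Exact length check on the line-based form a shell script would read:
    # an embedded newline in any value would shift every later field.
    lines = "\n".join(values).split("\n")
    if len(lines) != len(FIELDS):
        return list(DEFAULTS)
    return values
```

A shell script could print these values one per line from a single Python invocation and read them into an array, rather than starting one interpreter per field.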
Full Changelog: https://github.com/doobidoo/mcp-memory-service/blob/main/CHANGELOG.md#8682---2026-01-04