v1.0.23 Release Notes
🎙️ On-Device Speech Recognition
Memex now supports fully offline speech-to-text powered by sherpa-onnx + SenseVoice-Small. All audio processing runs entirely on-device — no data leaves your phone.
Features:
- Real-time voice-to-text with VAD (Voice Activity Detection) — transcription appears as you speak
- Periodic calibration during recording for improved accuracy, with a final full-audio calibration on stop
- Long-press the mic button to import and transcribe an existing audio file (m4a/mp3/wav/ogg/aac/flac)
- Hardware acceleration: CoreML on iOS, NNAPI on Android
- First-time use requires a one-time model download (~230MB), with an optional China mirror for faster downloads
- Supports Chinese, English, Japanese, Korean, and Cantonese with automatic language detection
Technical details:
- Silero VAD (bundled, 629KB) for real-time speech segmentation
- All transcription runs in a background Isolate, so the UI thread is never blocked
- Native audio format conversion via AVAssetReader (iOS) / MediaCodec (Android)
- 60-second recording limit; no limit for imported audio files
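
For readers curious how the recording limit and the periodic and final calibration passes might fit together, here is a minimal Python sketch. It is not Memex's implementation. It assumes that "calibration" means re-transcribing all audio captured so far, and it leaves out VAD segmentation and the background Isolate. The `RecordingSession` class, the `transcribe` callback, and the 5-second calibration interval are hypothetical names and values chosen for illustration. Only the 60-second cap comes from the notes above.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_RECORDING_SECONDS = 60.0        # recording limit stated in the release notes
CALIBRATION_INTERVAL_SECONDS = 5.0  # hypothetical; the notes do not give an interval


@dataclass
class RecordingSession:
    # Hypothetical recognizer callback: takes all samples so far, returns text.
    transcribe: Callable[[List[float]], str]
    sample_rate: int = 16000
    samples: List[float] = field(default_factory=list)
    text: str = ""
    last_calibration_at: float = 0.0

    @property
    def duration(self) -> float:
        """Seconds of audio captured so far."""
        return len(self.samples) / self.sample_rate

    def feed(self, chunk: List[float]) -> bool:
        """Append a chunk; return False once the recording cap is reached."""
        max_samples = int(MAX_RECORDING_SECONDS * self.sample_rate)
        remaining = max_samples - len(self.samples)
        if remaining <= 0:
            return False
        # Drop any samples that would exceed the cap.
        self.samples.extend(chunk[:remaining])
        # Periodic calibration: re-transcribe everything captured so far.
        if self.duration - self.last_calibration_at >= CALIBRATION_INTERVAL_SECONDS:
            self.text = self.transcribe(self.samples)
            self.last_calibration_at = self.duration
        return len(self.samples) < max_samples

    def stop(self) -> str:
        """Final calibration over the full recording."""
        self.text = self.transcribe(self.samples)
        return self.text
```

In this sketch the caller keeps passing microphone chunks to `feed` until it returns False or the user stops recording, then calls `stop` to get the final text. Imported audio files would skip the cap entirely and go straight to a single full-audio pass.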