React Native ExecuTorch v0.9.0 🔒 🚀
This release, as always, brings a lot of fresh goodies to the React Native world ⚛️. It introduces new features across the stack, whether that's speech, vision, or NLP. Check it out 👇🏻
What's new?
Speech
- Up to 10x performance improvements for transcription - We introduced a way to integrate VAD into the Whisper transcription pipeline, along with underlying algorithm optimizations & NPU model exports, resulting in significantly improved performance docs
- Multilingual Text-to-Speech - We introduced a lot of new languages to the TTS API, including 🇵🇱 🇪🇸 🇮🇹 🇫🇷 🇮🇳 docs
- Text-to-Speech improvements - We improved the phonemization mechanisms to significantly enhance the pronunciation quality for existing languages
- VAD streaming API - You can now run voice activity detection in a continuous, streaming fashion docs
NLP
- OpenAI's Privacy Filter - A new `usePrivacyFilter` API, making it easy to detect private user information from text, fully locally docs
- Multilingual text embeddings - It is now possible to generate embeddings across different languages docs
- Qwen 3.5 support - Another LLM in our collection docs
- LFM2.5-VL-450M (quantized) & LFM2.5 350M - New multimodal and text-only LLMs from Liquid AI added to the registry docs
- Smarter LLM sampling - `min_p` and `repetition_penalty` sampling, per-model defaults, and letterboxed multimodal image preprocessing docs
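
For readers unfamiliar with the two sampling options named in the NLP list, here is a minimal, generic sketch of how min-p filtering and a repetition penalty are commonly applied to raw logits. It is not taken from the React Native ExecuTorch sources; every function and parameter name below is hypothetical, and the library's actual behavior and defaults may differ.

```typescript
// Generic illustration only; not the library's implementation.
// All names in this block are hypothetical.

/** Lower the score of tokens that were already generated. */
function applyRepetitionPenalty(
  logits: number[],
  previousTokens: number[],
  penalty: number
): number[] {
  const out = logits.slice();
  const done: Record<number, boolean> = {};
  for (const token of previousTokens) {
    // Penalize each distinct, in-range token exactly once.
    if (done[token] || token < 0 || token >= out.length) continue;
    done[token] = true;
    // Dividing positive logits and multiplying negative ones both reduce the score.
    out[token] = out[token] > 0 ? out[token] / penalty : out[token] * penalty;
  }
  return out;
}

/** Numerically stable softmax over logits. */
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

/** Keep tokens with probability >= minP * (top probability), then renormalize. */
function minPFilter(probs: number[], minP: number): number[] {
  const threshold = minP * Math.max(...probs);
  const kept = probs.map((p) => (p >= threshold ? p : 0));
  const sum = kept.reduce((a, b) => a + b, 0);
  return kept.map((p) => p / sum);
}

/** Draw a token index from a distribution; `rand` is injectable for testing. */
function sampleToken(probs: number[], rand: () => number = Math.random): number {
  const r = rand();
  let cumulative = 0;
  let fallback = 0;
  for (let i = 0; i < probs.length; i++) {
    if (probs[i] > 0) fallback = i;
    cumulative += probs[i];
    if (r < cumulative) return i;
  }
  return fallback; // guards against floating-point rounding
}

/** One sampling step: penalty, softmax, min-p filter, draw. */
function sampleNext(
  logits: number[],
  previousTokens: number[],
  minP: number,
  penalty: number
): number {
  const penalized = applyRepetitionPenalty(logits, previousTokens, penalty);
  return sampleToken(minPFilter(softmax(penalized), minP));
}
```

The min-p cutoff scales with the model's confidence: when one token dominates, few alternatives survive, and when the distribution is flat, more candidates stay in play.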
Computer vision
- Pose estimation - A new Pose Estimation API with YOLO 26 docs
- FastSAM with prompts - Segment Anything style masks driven by point, box, or text prompts, fully on-device docs
General
- ExecuTorch runtime bumped to v1.2.0 - Updated native binaries and headers across iOS and Android, including the new `AlreadyLoaded` error mapping docs
- Typed `models` registry - A new declarative `models` accessor in `constants/modelRegistry.ts`, grouped one-to-one with hooks, so you can pick a backend / precision without juggling URL constants docs
- Extended tokenizer support - Rebuilt tokenizer binaries with unigram & word-level support, plus previously unsupported pre-tokenizers, decoders, and post-processors
- Graceful native-lib degradation on Android - Apps no longer crash at startup on 32-bit / unsupported ABIs; `isAvailable === false` lets you render a fallback UI instead docs
- iOS deployment target raised to 17.0 - Matches the prebuilt ExecuTorch libraries; apps targeting older iOS versions will need to bump
Breaking changes ⚠️
- `TextToImage` returns a `file://` URI instead of base64 - `useTextToImage().generate()` (and `TextToImageModule.forward`) now resolves to a path on disk pointing to a PNG, instead of a base64-encoded payload. The `pngjs` dependency is no longer needed.
- Multilingual TTS API reshape - The Kokoro pipeline now takes per-language voice / voice-quality configuration to support voice cloning and fine-tuned variants. Existing TTS call sites need to migrate.
- Speech-to-text quantized model constants removed - Predefined constants for quantized Whisper variants were dropped; the quality delta vs. the originals didn't justify the extra API surface. iOS now defaults to the new CoreML-backed Whisper builds.
- `OCRDetection.bbox` is now `Bbox`, not `Point[]` - The native vision stack moved to a unified `Point` struct everywhere; consumers reading raw bbox geometry off OCR results need to update.
- `ResourceFetcherAdapter.fetch` return shape changed - Custom adapters must now return `{ path: string; wasDownloaded: boolean }[]` instead of `string[]`, so callers can tell which resources were freshly downloaded vs. served from cache. Update any custom adapter implementations accordingly. docs
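
As a rough illustration of the `ResourceFetcherAdapter.fetch` migration, the sketch below wraps a fetch function that returns `string[]` so that it reports the new `{ path, wasDownloaded }` shape. Only the result shape comes from these notes. The argument list of `fetch`, the `LegacyFetch` type, and the `isCached` callback are assumptions made for the example, so check the adapter interface in the docs before adapting it.

```typescript
// Result shape quoted from the release notes.
type FetchedResource = { path: string; wasDownloaded: boolean };

// Hypothetical: a pre-0.9.0 style fetch that only returned local paths.
// The real adapter signature may take different arguments.
type LegacyFetch = (...sources: string[]) => Promise<string[]>;

/**
 * Wrap a legacy fetch so it returns the new shape.
 * `isCached` is a hypothetical callback reporting whether a source
 * was already on disk before the fetch started.
 */
function upgradeLegacyFetch(
  legacyFetch: LegacyFetch,
  isCached: (source: string) => Promise<boolean>
): (...sources: string[]) => Promise<FetchedResource[]> {
  return async (...sources: string[]): Promise<FetchedResource[]> => {
    // Record cache state first, because fetching populates the cache.
    const cachedBefore = await Promise.all(sources.map(isCached));
    const paths = await legacyFetch(...sources);
    // Assumes the legacy fetch returns one path per source, in order.
    return paths.map((path, i) => ({ path, wasDownloaded: !cachedBefore[i] }));
  };
}
```

Checking the cache before delegating keeps the wrapper independent of how the underlying fetch stores files, at the cost of one extra lookup per source.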