google-ai-edge/mediapipe v0.10.32 on GitHub

Build changes

Enables ml drift metal delegate as inference calculator backend.
[mediapipe] support armv7 (32 bit) in mediapipe tasks
Do not assume canvas is BGRA in RenderToWebGpuCanvas.
Fix sampling logic in ImageToTensorConverterWebGpu.
Migrate GlShaderCalculator to API3.
Migrate gl_shader_calculator_test to use API3 builder.

Bazel changes

[mediapipe] version bump to 0.10.27
Dawn has completed these changes, so the old paths are no longer used.
Integrate tiny Juno inpainting graph into GenAiProcessor
Readme for API3
Add Resources::ResolveId to enable placeholder resource ids usage.
Web LLM: a few more small edits for Gemma3n
Include headers from global namespace
Add comment to Eigen version in WORKSPACE to remind about synchronization with TensorFlow's Eigen dependency
Migrating VisibilityCopyCalculator to API3.
Update from Bazel v6.5.0 to v7.4.1 and Protobuf v3.19.1 to v5.28.3. Other package versions in WORKSPACE are updated as well.
Fix for weight cache on Windows.
Create Selfie Segmentation Demo App for LiteRT NPU.
Add Any support for API3
Adding AudioBuffer support to web LLM Inference API to handle more audio input types for MM models
Provide API3 interface for PassThroughCalculator using newly added Any type.
Migrate MergeCalculator to API3 and newly introduced Any type.
pybind11 version and py_proto_library macro update.
Add test for PacketResamplerCalculator with a very short video.
Initial version of sync function runner for API3
Fix function runner error reporting.
[mediapipe] version bump
Migrate CombinedPredictionCalculator to API3
Clean up CombinedPredictionCalculator
Currently, wrapping a TextureFrame in a MediaPipe Packet assumes the texture is 8-bit RGBA. This patch allows specifying other texture formats to support common color formats like RGBA16F for HDR content.
Support timestamp bound updates in function runner.
Migrate TensorsToSegmentationCalculator to MediaPipe API3.
Add OneOf support for API3.
Provide inference calculator API3 interface.
Migrate LandmarksToMatrixCalculator to API3
Update MediaPipe OSS to C++20.
Add a flag to use fp16 activations in tests.
Migrate HandednessToMatrixCalculator to API3.
Update xnnpack version.
Use the new xnn_reduce_mean_squared reduction for the RMSNorm.
Migrate ImageToTensorCalculator to API3.
Consistently use MutexLock instead of manual locking/unlocking
Add ImageProcessingOptions to FaceDetector C API
Enable node names as compile time strings in OSS.
Migrate API3 nodes to use compile time string names.
Fall back to producer context in gpu_buffer.GetReadView
Document api3 GetOrDie / VisitOrDie
Update log for missing InferenceCalculatorXnnpack registration.
Add NodeName for non-generic calculator context.
Add ImageProcessingOptions support to FaceLandmarker C API.
Migrate FaceLandmarker C API to use MediaPipe Image
Update CombinedPredictionCalculator test to new Runner
Fix comment about when things die.
Migrate WebGpuShaderCalculator to MediaPipe API3.
Proto changes for Tiny Gemma on ml_drift
Enable API3 FunctionRunner for WEB
Bump MediaPipe version to 0.10.29.
Add CompareAndSaveImageOutputDynamic to compare to a dynamic golden instead of a file.
Bump MediaPipe version to 0.10.30.
Improve error message of graph validation, to include node calculator name
Refactor Hand Landmarker C API to use new MP Image
Add ExternalGlTextureSyncMode to require efficient synchronization.
Add an option to get a Packet for API3 OneOf input.
Migrate GpuBufferToImageFrameCalculator to API3.
Add support to pass a single visitor in VisitOrDie for OneOf inputs.
Add VisitAsPacketOrDie for OneOf inputs.
Add MediaPipe Tasks C API for AudioClassifier.
Ensure correct type of #api3 Packet.
Support fractional frame rates in MediaPipe video processing.
Add ImageProcessingOptions support to Object Detector C API.
Add ImageProcessingOptions support to MediaPipe PoseLandmarker C API.
Update object detector to apply ImageFrame C API
Update pose landmarker to apply ImageFrame C API
Migrate HandAssociationCalculator to MediaPipe API3.
Adding ImageProcessingOptions to image_classifier C API.
Adding ImageProcessingOptions to gesture_recognizer C API.
Migrate GestureRecognizer (C API) to MpImagePtr
Migrate ImageFrameToGpuBufferCalculator to API3
Set default thread_num to LiteRT::CPU delegate
Qualify Packet and MakePacket while in mediapipe::api3 namespace to avoid future collisions with api3::Packet / api3::MakePacket
Remove redundant empty parentheses from lambdas in MediaPipe API3.
Migrate ImageClassifier (C API) to MpImagePtr.
Update HandAssociationCalculatorTest to new Runner
Add ImageProcessingOptions support to Image Segmenter C API
Migrate TensorsToSegmentationCalculator to API3
Extend lifetime of Image data when MpImage is constructed from an existing MediaPipe Image
Allow GetData() calls for contiguous images
Migrate ImageSegmenter C API to MpImagePtr
Add GetLabels() to the ImageSegmenter C API
Generalize UnpackMediaSequenceCalculator's support for encoded media streams.
Clean up unused variables
Get rid of _with_options in favor of optional param (C & Python API)
Migrate Python ImageSegmenter to C API
Refactor Image Embedder C API to use MpImage
Simplify FunctionRunner template types.
Simplify setting options in HandAssociationCalculatorTest
Update MediaPipe C API vision task result callbacks to use MpStatus.
Offer attachments functionality from WebGPU service.
Enable creation of WebGPU service from explicitly provided wgpu::Device.
Remove unnecessary checks.
Enables RGBA input with RGB output.
Modify the Has{} generic function to check across the different value kinds of
Refactor TextClassifier C API to use MpStatus.
Update C API for TextEmbedder to use MpStatus
Update C API for LanguageDetector to return MpStatus
Add ImageProcessingOptions support to the MediaPipe ImageEmbedder C API.
Add ImageProcessingOptions to InteractiveSegmenter C API and migrate to MpImage
Fix counting pixels with different colors for image comparison tests
Remove redundant has_confidence_masks field from ImageSegmenterResult.
Migrate ImageClassifier C API to use MpStatus.
Refactor Face Detector C API to use MpStatus.
Update ImageSegmenter C API to return MpStatus
Add support for XNNPACK's SLOW_CONSISTENT_ARITHMETIC flag
Update ImageEmbedder C API to return MpStatus
Update InteractiveSegmenter C API to return MpStatus
Get rid of _with_options in favor of optional param (C)
Refactor Gesture Recognizer C API to use MpStatus.
Update ObjectDetector C API to return MpStatus
Update HandLandmarker C API to return MpStatus
Update PoseLandmarker C API to return MpStatus
Fix libmediapipe.so compilation on Windows
Remove no longer used Image types
Add side packets support for FunctionRunner.
Fix documentation.
Update bot assignees in bot_config.yml.
Added new R8 mode for GlShaderCalculator
Add visibility declarations for Windows
Centralize WebGPU header includes
Web Solutions: patch for importScripts error with modules in workers
Small cleanups in MP Task C++ segmentation graphs and ModelTaskGraph
Update AudioClassifier to retain error messages
Retain error messages in the Metadata API
Retain error messages in Language Detector
Update GestureRecognizer to retain error messages
Update FaceLandmarker C API to use new naming and return type convention
Retain error messages in TextEmbedder
Retain error messages in HandLandmarker
Retain error messages in ImageEmbedder
Retain Error Messages in Object Detector
Small cleanup of scheduler_queue
Refactor ImageClassifier C API to return error messages.
Allow empty Tensors in InferenceCalculator
Update ImageSegmenter to retain error messages
Retain error messages for PoseLandmarker
Retain error messages in MpImage
Allow packing of input streams into empty SequenceExample.
Retain error messages in Interactive Segmenter
Replace custom test macros with EXPECT_EQ/ASSERT_EQ
Add MpErrorFree to avoid missing function on Windows
Bump MP version to 0.10.31
Add Kotlin support to MediaPipe OSS repo
Prepare for functiongemma with MP web LLM API
Add experimental mapSync support in GetTexture2dData.
Fail with an error on an empty decoded image in OpenCVEncodedImageToImageFrameCalculator. An empty decoded image can occur if decoding fails.
Support option dependencies in mediapipe proto rules
Fix logging large one-dimensional vertical data
Adds GPU output support for category masks (copy only, result listener zero-copy case is not addressed yet)
Add wgpu::ExternalTexture support

MediaPipe Tasks update

This section highlights changes that are specific to a single platform and do not propagate to other platforms.

Android

Allow users to configure NPU delegate
Remove all references to subgraph reshaping which is enabled by default
Don't swallow Task exceptions for synchronous use cases
Update score thresholds for Java classifier/embedder tests
Restore default .so location
Add RegionOfInterest Proto to Java Protobuf list
Don't assume images are RGB
Allow users to configure the NPU delegate

iOS

Expose preferredBackends
Add stream cancellation support in Swift API.
Add audio modality support to iOS GenAI inference.
Use wrapper type for RenderData
Remove MP AudioEmbedder

JavaScript

Web LLM: basic .wav audio support for Gemma 3N
Web LLM: Small fix to multimodal error message strings
Add a test case to test empty packet inputs.
Migrate tasks tests to use common image test util.
Enables GraphRuntimeInfo feature for Emscripten / web targets
Web LLM API: cancelProcessing() call for early-exiting during decoding. Will not work until next npm update.
Fix attachments synchronization for non-WEB use cases.
Fix for MPMask f32-->u8 conversions not always working consistently on all devices. See #5862.

Python

Add prompt_prefix/suffix_system params to ModelConfig
Use the C API for TextEmbedder
Add lora_alpha support in converter
Migrate FaceDetector to use the C API.
Add ctypes structures for Landmark and conversion
Move CategoryC to Category conversion to Category and add Categories conversion
Update BoundingBoxC and BoundingBox to avoid circular dependency
Add MatrixC ctypes structure for MediaPipe C API
Create shared conversion functions that can be used by FaceLandmarker
Migrate FaceLandmarker to use C API
Move keypoint conversion to Keypoint
Move DetectionC to Detection conversion to Detection and add test
Add Landmark.from_ctypes
Create Python Audio Data wrapper for C API
Update HandLandmarker to use the C API
Refactor Python AudioClassifier to use the MediaPipe Tasks C API.
Handle an unknown category index in the ctypes Category conversion
Migrate ObjectDetector to use C API
Migrate PoseLandmarker to use C API
Create a dispatch library that forwards all C API calls through a single thread
Migrate GestureRecognizers (Python) to use C API.
Migrate ImageClassifier to use C API.
Update MpStatus conversion to use more specific errors
Ensure all callbacks are invoked on a valid Python thread
Update Python tasks to use the new AsyncResultDispatcher
Improve Python ctypes handling of quantized embeddings.
Migrate MediaPipe Python ImageEmbedder to use the C API
Migrate Python InteractiveSegmenter to use the MediaPipe Tasks C API.
Integrate AsyncResultDispatcher into ImageSegmenter.
Add __del__ methods to Python tasks
Make LLM Converter testable by moving logic to a class
Convert LLM Bundler to a C API
Removed old base class for TextTasks, old base class for AudioTasks, unused TextRecognizer, HolisticLandmarker from Python (for now), Protobuf dependencies from MediaPipe Python Package, and also Python Tasks runners.
Replace _pywrap_flatbuffers based Flatbuffers writer with a ctypes-based C API in metadata.py
Make TextEmbedderResult and GestureRecognizerResult public
Restore Python 3.9 compatibility for MediaPipe PIP package and allow file loading in OSS
Retain error messages in Text Classifier Python API
Update LLM Converter to retain error messages
Create Python Pipeline for the new PIP package
Improve Python Wheel on Mac and Windows
Create MP Tasks DrawingUtils
Replace the LLM Bundler Proto dependency with a C API
Re-add GPU support to Python API

MediaPipe Dependencies

Fix OpenCV dependency.
Update WASM files for 0.10.32 release
