github google-ai-edge/mediapipe v0.10.32
MediaPipe v0.10.32

18 hours ago

Build changes

  • Enables ml drift metal delegate as inference calculator backend.
  • [mediapipe] support armv7 (32 bit) in mediapipe tasks
  • Do not assume canvas is BGRA in RenderToWebGpuCanvas.
  • Fix sampling logic in ImageToTensorConverterWebGpu.
  • Migrate GlShaderCalculator to API3.
  • Migrate gl_shader_calculator_test to use API3 builder.

Bazel changes

  • [mediapipe] version bump to 0.10.27
  • Dawn has completed these changes, so the old paths are no longer used.
  • Integrate tiny Juno inpainting graph into GenAiProcessor
  • Readme for API3
  • Add Resources::ResolveId to enable placeholder resource ids usage.
  • Web LLM: a few more small edits for Gemma3n
  • Include headers from global namespace
  • Add comment to Eigen version in WORKSPACE to remind about synchronization with TensorFlow's Eigen dependency
  • Migrating VisibilityCopyCalculator to API3.
  • Update from Bazel v6.5.0 to v7.4.1, Protobuf v3.19.1 to v5.28.3. Other packages also update the version within WORKSPACE.
  • Fix for weight cache on Windows.
  • Create Selfie Segmentation Demo App for LiteRT NPU.
  • Add Any support for API3
  • Adding AudioBuffer support to web LLM Inference API to handle more audio input types for MM models
  • Provide API3 interface for PassThroughCalculator using newly added Any type.
  • Migrate MergeCalculator to API3 and newly introduced Any type.
  • pybind11 version and py_proto_library macro update.
  • Add test for PacketResamplerCalculator with a very short video.
  • Initial version of sync function runner for API3
  • Fix function runner error reporting.
  • [mediapipe] version bump
  • Migrate CombinedPredictionCalculator to API3
  • Clean up CombinedPredictionCalculator
  • Currently, wrapping a TextureFrame in a MediaPipe Packet assumes the texture is 8-bit RGBA. This patch allows specifying other texture formats to support common color formats like RGBA16F for HDR content.
  • Support timestamp bound updates in function runner.
  • Migrate TensorsToSegmentationCalculator to MediaPipe API3.
  • Add OneOf support for API3.
  • Provide inference calculator API3 interface.
  • Migrate LandmarksToMatrixCalculator to API3
  • Update MediaPipe OSS to C++20.
  • Add a flag to use fp16 activations in tests.
  • Migrate HandednessToMatrixCalculator to API3.
  • Update xnnpack version.
  • Use the new xnn_reduce_mean_squared reduction for the RMSNorm.
  • Migrate ImageToTensorCalculator to API3.
  • Consistently use MutexLock instead of manual locking/unlocking
  • Add ImageProcessingOptions to FaceDetector C API
  • Enable node names as compile time strings in OSS.
  • Migrate API3 nodes to use compile time string names.
  • Fall back to producer context in gpu_buffer.GetReadView
  • Document api3 GetOrDie / VisitOrDie
  • Update log for missing InferenceCalculatorXnnpack registration.
  • Add NodeName for non-generic calculator context.
  • Add ImageProcessingOptions support to FaceLandmarker C API.
  • Migrate FaceLandmarker C API to use MediaPipe Image
  • Update CombinedPredictionCalculator test to new Runner
  • Fix comment about when things die.
  • Migrate WebGpuShaderCalculator to MediaPipe API3.
  • Proto changes for Tiny Gemma on ml_drift
  • Enable API3 FunctionRunner for WEB
  • Bump MediaPipe version to 0.10.29.
  • Add CompareAndSaveImageOutputDynamic to compare to a dynamic golden instead of a file.
  • Bump MediaPipe version to 0.10.30.
  • Improve error message of graph validation, to include node calculator name
  • Refactor Hand Landmarker C API to use new MP Image
  • Add ExternalGlTextureSyncMode to require efficient synchronization.
  • Add an option to get a Packet for API3 OneOf input.
  • Migrate GpuBufferToImageFrameCalculator to API3.
  • Add support to pass a single visitor in VisitOrDie for OneOf inputs.
  • Add VisitAsPacketOrDie for OneOf inputs.
  • Add MediaPipe Tasks C API for AudioClassifier.
  • Ensure correct type of #api3 Packet.
  • Support fractional frame rates in MediaPipe video processing.
  • Add ImageProcessingOptions support to Object Detector C API.
  • Add ImageProcessingOptions support to MediaPipe PoseLandmarker C API.
  • Update object detector to apply ImageFrame C API
  • Update pose landmarker to apply ImageFrame C API
  • Migrate HandAssociationCalculator to MediaPipe API3.
  • Adding ImageProcessingOptions to image_classifier C API.
  • Adding ImageProcessingOptions to gesture_recognizer C API.
  • Migrate GestureRecognizer (C API) to MpImagePtr
  • Migrate ImageFrameToGpuBufferCalculator to API3
  • Set default thread_num to LiteRT::CPU delegate
  • Qualify Packet and MakePacket while in mediapipe::api3 namespace to avoid future collisions with api3::Packet / api3::MakePacket
  • Remove redundant empty parentheses from lambdas in MediaPipe API3.
  • Migrate ImageClassifier (C API) to MpImagePtr.
  • Update HandAssociationCalculatorTest to new Runner
  • Add ImageProcessingOptions support to Image Segmenter C API
  • Migrate TensorsToSegmentationCalculator to API3
  • Extend lifetime of Image data when MpImage is constructed from an existing MediaPipe Image
  • Allow GetData() calls for contiguous images
  • Migrate ImageSegmenter C API to MpImagePtr
  • Add GetLabels() to the ImageSegmenter C API
  • Generalize UnpackMediaSequenceCalculator's support for encoded media streams.
  • Clean up unused variables
  • Get rid of _with_options in favor of optional param (C & Python API)
  • Migrate Python ImageSegmenter to C API
  • Refactor Image Embedder C API to use MpImage
  • Simplify FunctionRunner template types.
  • Simplify setting options in HandAssociationCalculatorTest
  • Update MediaPipe C API vision task result callbacks to use MpStatus.
  • Offer attachments functionality from WebGPU service.
  • Enable creation of WebGPU service from explicitly provided wgpu::Device.
  • Remove unnecessary checks.
  • Enables RGBA input with RGB output.
  • Modify the Has{} generic function to check across the different value kinds of
  • Refactor TextClassifier C API to use MpStatus.
  • Update C API for TextEmbedder to use MpStatus
  • Update C API for LanguageDetector to return MpStatus
  • Add ImageProcessingOptions support to the MediaPipe ImageEmbedder C API.
  • Add ImageProcessingOptions to InteractiveSegmenter C API and migrate to MpImage
  • Fix counting pixels with different colors for image comparison tests
  • Remove redundant has_confidence_masks field from ImageSegmenterResult.
  • Migrate ImageClassifier C API to use MpStatus.
  • Refactor Face Detector C API to use MpStatus.
  • Update ImageSegmenter C API to return MpStatus
  • Add support for XNNPACK's SLOW_CONSISTENT_ARITHMETIC flag
  • Update ImageEmbedder C API to return MpStatus
  • Update InteractiveSegmenter C API to return MpStatus
  • Get rid of _with_options in favor of optional param (C)
  • Refactor Gesture Recognizer C API to use MpStatus.
  • Update ObjectDetector C API to return MpStatus
  • Update HandLandmarker C API to return MpStatus
  • Update PoseLandmarker C API to return MpStatus
  • Fix libmediapipe.so compilation on Windows
  • Remove no longer used Image types
  • Add side packets support for FunctionRunner.
  • Fix documentation.
  • Update bot assignees in bot_config.yml.
  • Added new R8 mode for GlShaderCalculator
  • Add visibility declarations for Windows
  • Centralize WebGPU header includes
  • Web Solutions: patch for importScripts error with modules in workers
  • Small cleanups in MP Task C++ segmentation graphs and ModelTaskGraph
  • Update AudioClassifier to retain error messages
  • Retain error messages in the Metadata API
  • Retain error messages in Language Detector
  • Update GestureRecognizer to retain error messages
  • Update FaceLandmarker C API to use new naming and return type convention
  • Retain error messages in TextEmbedder
  • Retain error messages in HandLandmarker
  • Retain error messages in ImageEmbedder
  • Retain Error Messages in Object Detector
  • Small cleanup of scheduler_queue
  • Refactor ImageClassifier C API to return error messages.
  • Allow empty Tensors in InferenceCalculator
  • Update ImageSegmenter to retain error messages
  • Retain error messages for PoseLandmarker
  • Retain error messages in MpImage
  • Allow packing of input streams into empty SequenceExample.
  • Retain error messages in Interactive Segmenter
  • Replace custom test macros with EXPECT_EQ/ASSERT_EQ
  • Add MpErrorFree to avoid missing function on Windows
  • Bump MP version to 0.10.31
  • Add Kotlin support to MediaPipe OSS repo
  • Prepare for functiongemma with MP web LLM API
  • Add experimental mapSync support in GetTexture2dData.
  • Fail with an error on an empty decoded image in OpenCVEncodedImageToImageFrameCalculator. An empty decoded image can occur if decoding fails.
  • Support option dependencies in mediapipe proto rules
  • Fix logging large one-dimensional vertical data
  • Adds GPU output support for category masks (copy only, result listener zero-copy case is not addressed yet)
  • Add wgpu::ExternalTexture support
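
The "Support fractional frame rates" entry above does not say how the rates are represented. As a general illustration of why non-integer rates need care, here is a minimal sketch that assumes a rate such as NTSC's 30000/1001 fps and timestamps expressed in whole microseconds. The function name frame_timestamp_us is hypothetical and is not part of MediaPipe.

```python
from fractions import Fraction

MICROS_PER_SECOND = 1_000_000


def frame_timestamp_us(frame_index: int, frame_rate: Fraction) -> int:
    """Return the timestamp of a frame in whole microseconds.

    Hypothetical helper for illustration only. The value is computed from the
    frame index with exact rational arithmetic and rounded once, so rounding
    error does not accumulate from frame to frame.
    """
    return round(Fraction(frame_index * MICROS_PER_SECOND) / frame_rate)


# NTSC video runs at 30000/1001 frames per second (about 29.97 fps).
ntsc = Fraction(30000, 1001)

timestamps = [frame_timestamp_us(i, ntsc) for i in range(4)]
print(timestamps)  # [0, 33367, 66733, 100100]

# Adding a rounded per-frame period instead drifts over time.
rounded_period_us = round(MICROS_PER_SECOND / ntsc)        # 33367
drift_us = 30000 * rounded_period_us - frame_timestamp_us(30000, ntsc)
print(drift_us)  # 10000 microseconds after 30000 frames
```

Computing each timestamp from the frame index keeps frame 30000 at exactly 1001 seconds, whereas repeatedly adding a rounded 33367 microsecond period ends up 10 ms late at the same point.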

MediaPipe Tasks update


Android

  • Allow users to configure NPU delegate
  • Remove all references to subgraph reshaping which is enabled by default
  • Don't swallow Task exceptions for synchronous use cases
  • Update score thresholds for Java classifier/embedder tests
  • Restore default .so location
  • Add RegionOfInterest Proto to Java Protobuf list
  • Don't assume images are RGB
  • Allow users to configure the NPU delegate

iOS

  • Expose preferredBackends
  • Add stream cancellation support in swift API.
  • Add audio modality support to iOS GenAI inference.
  • Use wrapper type for RenderData
  • Remove MP AudioEmbedder

Javascript

  • Web LLM: basic .wav audio support for Gemma 3N
  • Web LLM: Small fix to multimodal error message strings
  • Add a test case to test empty packet inputs.
  • Migrate tasks tests to use common image test util.
  • Enables GraphRuntimeInfo feature for Emscripten / web targets
  • Web LLM API: cancelProcessing() call for early-exiting during decoding. Will not work until next npm update.
  • Fix attachments synchronization for non-WEB use cases.
  • Fix for MPMask f32-->u8 conversions not always working consistently on all devices. See #5862.

Python

  • Add prompt_prefix/suffix_system params to ModelConfig
  • Use the C API for TextEmbedder
  • Add lora_alpha support in converter
  • Migrate FaceDetector to use the C API.
  • Add ctypes structures for Landmark and conversion
  • Move CategoryC to Category conversion to Category and add Categories conversion
  • Update BoundingBoxC and BoundingBox to avoid circular dependency
  • Add MatrixC ctypes structure for MediaPipe C API
  • Create shared conversion functions that can be used by FaceLandmarker
  • Migrate FaceLandmarker to use C API
  • Move keypoint conversion to Keypoint
  • Move DetectionC to Detection conversion to Detection and add test
  • Add Landmark.from_ctypes
  • Create Python Audio Data wrapper for C API
  • Update HandLandmarker to use the C API
  • Refactor Python AudioClassifier to use the MediaPipe Tasks C API.
  • Handle an unknown category index in the ctypes Category conversion
  • Migrate ObjectDetector to use C API
  • Migrate PoseLandmarker to use C API
  • Create a dispatch library that forwards all C API calls through a single thread
  • Migrate GestureRecognizers (Python) to use C API.
  • Migrate ImageClassifier to use C API.
  • Update MpStatus conversion to use more specific errors
  • Ensure all callbacks are invoked on a valid Python thread
  • Update Python tasks to use the new AsyncResultDispatcher
  • Improve Python ctypes handling of quantized embeddings.
  • Migrate MediaPipe Python ImageEmbedder to use the C API
  • Migrate Python InteractiveSegmenter to use the MediaPipe Tasks C API.
  • Integrate AsyncResultDispatcher into ImageSegmenter.
  • Add __del__ methods to Python tasks
  • Make LLM Converter testable by moving logic to a class
  • Convert LLM Bundler to a C API
  • Removed old base class for TextTasks, old base class for AudioTasks, unused TextRecognizer, HolisticLandmarker from Python (for now), Protobuf dependencies from the MediaPipe Python package, and the Python Tasks runners.
  • Replace _pywrap_flatbuffers based Flatbuffers writer with a ctypes-based C API in metadata.py
  • Make TextEmbedderResult and GestureRecognizerResult public
  • Restore Python 3.9 compatibility for MediaPipe PIP package and allow file loading in OSS
  • Retain error messages in Text Classifier Python API
  • Update LLM Converter to retain error messages
  • Create Python Pipeline for the new PIP package
  • Improve Python Wheel on Mac and Windows
  • Create MP Tasks DrawingUtils
  • Replace the LLM Bundler Proto dependency with a C API
  • Re-add GPU support to Python API

MediaPipe Dependencies

  • Fix OpenCV dependency.
  • Update WASM files for 0.10.32 release
