Build changes
- Enables ml drift metal delegate as inference calculator backend.
- [mediapipe] Support armv7 (32-bit) in MediaPipe Tasks
- Do not assume canvas is BGRA in RenderToWebGpuCanvas.
- Fix sampling logic in ImageToTensorConverterWebGpu.
- Migrate GlShaderCalculator to API3.
- Migrate gl_shader_calculator_test to use API3 builder.
Bazel changes
- [mediapipe] version bump to 0.10.27
- Dawn has completed these changes, so the old paths are no longer used.
- Integrate tiny Juno inpainting graph into GenAiProcessor
- Readme for API3
- Add Resources::ResolveId to enable placeholder resource ids usage.
- Web LLM: a few more small edits for Gemma3n
- Include headers from global namespace
- Add comment to Eigen version in WORKSPACE to remind about synchronization with TensorFlow's Eigen dependency
- Migrating VisibilityCopyCalculator to API3.
- Update from Bazel v6.5.0 to v7.4.1, Protobuf v3.19.1 to v5.28.3. Other packages also update the version within WORKSPACE.
- Fix for weight cache on Windows.
- Create Selfie Segmentation Demo App for LiteRT NPU.
- Add Any support for API3
- Adding AudioBuffer support to web LLM Inference API to handle more audio input types for MM models
- Provide API3 interface for PassThroughCalculator using newly added Any type.
- Migrate MergeCalculator to API3 and newly introduced Any type.
- pybind11 version and py_proto_library macro update.
- Add test for PacketResamplerCalculator with a very short video.
- Initial version of sync function runner for API3
- Fix function runner error reporting.
- [mediapipe] version bump
- Migrate CombinedPredictionCalculator to API3
- Clean up CombinedPredictionCalculator
- Currently, wrapping a TextureFrame in a MediaPipe Packet assumes the texture is 8-bit RGBA. This patch allows specifying other texture formats to support common color formats like RGBA16F for HDR content.
- Support timestamp bound updates in function runner.
- Migrate TensorsToSegmentationCalculator to MediaPipe API3.
- Add OneOf support for API3.
- Provide inference calculator API3 interface.
- Migrate LandmarksToMatrixCalculator to API3
- Update MediaPipe OSS to C++20.
- Add a flag to use fp16 activations in tests.
- Migrate HandednessToMatrixCalculator to API3.
- Update xnnpack version.
- Use the new xnn_reduce_mean_squared reduction for the RMSNorm.
- Migrate ImageToTensorCalculator to API3.
- Consistently use MutexLock instead of manual locking/unlocking
- Add ImageProcessingOptions to FaceDetector C API
- Enable node names as compile time strings in OSS.
- Migrate API3 nodes to use compile time string names.
- Fall back to producer context in gpu_buffer.GetReadView
- Document api3 GetOrDie / VisitOrDie
- Update log for missing InferenceCalculatorXnnpack registration.
- Add NodeName for non-generic calculator context.
- Add ImageProcessingOptions support to FaceLandmarker C API.
- Migrate FaceLandmarker C API to use MediaPipe Image
- Update CombinedPredictionCalculator test to new Runner
- Fix comment about when things die.
- Migrate WebGpuShaderCalculator to MediaPipe API3.
- Proto changes for Tiny Gemma on ml_drift
- Enable API3 FunctionRunner for WEB
- Bump MediaPipe version to 0.10.29.
- Add CompareAndSaveImageOutputDynamic to compare to a dynamic golden instead of a file.
- Bump MediaPipe version to 0.10.30.
- Improve error message of graph validation, to include node calculator name
- Refactor Hand Landmarker C API to use new MP Image
- Add ExternalGlTextureSyncMode to require efficient synchronization.
- Add an option to get a Packet for API3 OneOf input.
- Migrate GpuBufferToImageFrameCalculator to API3.
- Add support to pass a single visitor in VisitOrDie for OneOf inputs.
- Add VisitAsPacketOrDie for OneOf inputs.
- Add MediaPipe Tasks C API for AudioClassifier.
- Ensure correct type of #api3 Packet.
- Support fractional frame rates in MediaPipe video processing.
- Add ImageProcessingOptions support to Object Detector C API.
- Add ImageProcessingOptions support to MediaPipe PoseLandmarker C API.
- Update object detector to apply ImageFrame C API
- Update pose landmarker to apply ImageFrame C API
- Migrate HandAssociationCalculator to MediaPipe API3.
- Adding ImageProcessingOptions to image_classifier C API.
- Adding ImageProcessingOptions to gesture_recognizer C API.
- Migrate GestureRecognizer (C API) to MpImagePtr
- Migrate ImageFrameToGpuBufferCalculator to API3
- Set default thread_num to LiteRT::CPU delegate
- Qualify Packet and MakePacket while in mediapipe::api3 namespace to avoid future collisions with api3::Packet / api3::MakePacket
- Remove redundant empty parentheses from lambdas in MediaPipe API3.
- Migrate ImageClassifier (C API) to MpImagePtr.
- Update HandAssociationCalculatorTest to new Runner
- Add ImageProcessingOptions support to Image Segmenter C API
- Migrate TensorsToSegmentationCalculator to API3
- Extend lifetime of Image data when MpImage is constructed from an existing MediaPipe Image
- Allow GetData() calls for contiguous images
- Migrate ImageSegmenter C API to MpImagePtr
- Add GetLabels() to the ImageSegmenter C API
- Generalize UnpackMediaSequenceCalculator's support for encoded media streams.
- Clean up unused variables
- Get rid of _with_options in favor of optional param (C & Python API)
- Migrate Python ImageSegmenter to C API
- Refactor Image Embedder C API to use MpImage
- Simplify FunctionRunner template types.
- Simplify setting options in HandAssociationCalculatorTest
- Update MediaPipe C API vision task result callbacks to use MpStatus.
- Offer attachments functionality from WebGPU service.
- Enable creation of WebGPU service from explicitly provided wgpu::Device.
- Remove unnecessary checks.
- Enables RGBA input with RGB output.
- Modify the Has{} generic function to check across the different value kinds of
- Refactor TextClassifier C API to use MpStatus.
- Update C API for TextEmbedder to use MpStatus
- Update C API for LanguageDetector to return MpStatus
- Add ImageProcessingOptions support to the MediaPipe ImageEmbedder C API.
- Add ImageProcessingOptions to InteractiveSegmenter C API and migrate to MpImage
- Fix counting pixels with different colors for image comparison tests
- Remove redundant has_confidence_masks field from ImageSegmenterResult.
- Migrate ImageClassifier C API to use MpStatus.
- Refactor Face Detector C API to use MpStatus.
- Update ImageSegmenter C API to return MP Status
- Add support for XNNPACK's SLOW_CONSISTENT_ARITHMETIC flag
- Update ImageEmbedder C API to return MP Status
- Update InteractiveSegmenter C API to return MP Status
- Get rid of _with_options in favor of optional param (C)
- Refactor Gesture Recognizer C API to use MpStatus.
- Update ObjectDetector C API to return MpStatus
- Update HandLandmarker C API to return MP Status
- Update PoseLandmarker C API to return MpStatus
- Fix libmediapipe.so compilation on Windows
- Remove no longer used Image types
- Add side packets support for FunctionRunner.
- Fix documentation.
- Update bot assignees in bot_config.yml.
- Added new R8 mode for GlShaderCalculator
- Add visibility declarations for Windows
- Centralize WebGPU header includes
- Web Solutions: patch for importScripts error with modules in workers
- Small cleanups in MP Task C++ segmentation graphs and ModelTaskGraph
- Update AudioClassifier to retain error messages
- Retain error messages in the Metadata API
- Retain error messages in Language Detector
- Update GestureRecognizer to retain error messages
- Update FaceLandmarker C API to use new naming and return type convention
- Retain error messages in TextEmbedder
- Retain error messages in HandLandmarker
- Retain error messages in ImageEmbedder
- Retain Error Messages in Object Detector
- Small cleanup of scheduler_queue
- Refactor ImageClassifier C API to return error messages.
- Allow empty Tensors in InferenceCalculator
- Update ImageSegmenter to retain error messages
- Retain error messages for PoseLandmarker
- Retain error messages in MpImage
- Allow packing of input streams into empty SequenceExample.
- Retain error messages in Interactive Segmenter
- Replace custom test macros with EXPECT_EQ/ASSERT_EQ
- Add MpErrorFree to avoid missing function on Windows
- Bump MP version to 0.10.31
- Add Kotlin support to MediaPipe OSS repo
- Prepare for functiongemma with MP web LLM API
- Add experimental mapSync support in GetTexture2dData.
- Fail with an error on an empty decoded image in OpenCVEncodedImageToImageFrameCalculator. An empty decoded image can occur when decoding fails.
- Support option dependencies in mediapipe proto rules
- Fix logging large one-dimensional vertical data
- Adds GPU output support for category masks (copy only, result listener zero-copy case is not addressed yet)
- Add wgpu::ExternalTexture support
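
As a rough illustration of why the "Support fractional frame rates in MediaPipe video processing" entry above matters, the sketch below computes frame timestamps for a rational frame rate such as 30000/1001 (about 29.97 fps). The helper name `frame_timestamp_us` and the microsecond unit are assumptions made for this example; this is not MediaPipe's implementation.

```python
from fractions import Fraction


def frame_timestamp_us(index: int, frame_rate: Fraction) -> int:
    """Hypothetical helper: timestamp in microseconds of frame `index`.

    Exact rational arithmetic avoids the drift that accumulates when the
    frame period is rounded to an integer before multiplying.
    """
    return int(index * 1_000_000 / frame_rate)


# NTSC-style rate: 30000/1001 frames per second.
ntsc = Fraction(30000, 1001)
timestamps = [frame_timestamp_us(i, ntsc) for i in range(4)]
```

With a rate rounded to 30 fps, frame 30000 would land at exactly 1000 seconds; with the rational rate it lands at 1001 seconds, which is the kind of discrepancy fractional support is meant to avoid.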
MediaPipe Tasks update
This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.
Android
- Allow users to configure NPU delegate
- Remove all references to subgraph reshaping which is enabled by default
- Don't swallow Task exceptions for synchronous use cases
- Update score thresholds for Java classifier/embedder tests
- Restore default .so location
- Add RegionOfInterest Proto to Java Protobuf list
- Don't assume images are RGB
- Allow users to configure the NPU delegate
iOS
- Expose preferredBackends
- Add stream cancellation support in swift API.
- Add audio modality support to iOS GenAI inference.
- Use wrapper type for RenderData
- Remove MP AudioEmbedder
Javascript
- Web LLM: basic .wav audio support for Gemma 3N
- Web LLM: Small fix to multimodal error message strings
- Add a test case to test empty packet inputs.
- Migrate tasks tests to use common image test util.
- Enables GraphRuntimeInfo feature for Emscripten / web targets
- Web LLM API: cancelProcessing() call for early-exiting during decoding. Will not work until next npm update.
- Fix attachments synchronization for non-WEB use cases.
- Fix for MPMask f32-->u8 conversions not always working consistently on all devices. See #5862.
Python
- Add prompt_prefix/suffix_system params to ModelConfig
- Use the C API for TextEmbedder
- Add lora_alpha support in converter
- Migrate FaceDetector to use the C API.
- Add ctypes structures for Landmark and conversion
- Move CategoryC to Category conversion to Category and add Categories conversion
- Update BoundingBoxC and BoundingBox to avoid circular dependency
- Add MatrixC ctypes structure for MediaPipe C API
- Create shared conversion functions that can be used by FaceLandmarker
- Migrate FaceLandmarker to use C API
- Move keypoint conversion to Keypoint
- Move DetectionC to Detection conversion to Detection and add test
- Add Landmark.from_ctypes
- Create Python Audio Data wrapper for C API
- Update HandLandmarker to use the C API
- Refactor Python AudioClassifier to use the MediaPipe Tasks C API.
- Handle an unknown category index in the ctypes Category conversion
- Migrate ObjectDetector to use C API
- Migrate PoseLandmarker to use C API
- Create a dispatch library that forwards all C API calls through a single thread
- Migrate GestureRecognizers (Python) to use C API.
- Migrate ImageClassifier to use C API.
- Update MpStatus conversion to use more specific errors
- Ensure all callbacks are invoked on a valid Python thread
- Update Python tasks to use the new AsyncResultDispatcher
- Improve Python ctypes handling of quantized embeddings.
- Migrate MediaPipe Python ImageEmbedder to use the C API
- Migrate Python InteractiveSegmenter to use the MediaPipe Tasks C API.
- Integrate AsyncResultDispatcher into ImageSegmenter.
- Add __del__ methods to Python tasks
- Make LLM Converter testable by moving logic to a class
- Convert LLM Bundler to a C API
- Removed the old base classes for TextTasks and AudioTasks, the unused TextRecognizer, HolisticLandmarker from Python (for now), Protobuf dependencies from the MediaPipe Python package, and the Python Tasks runners.
- Replace _pywrap_flatbuffers based Flatbuffers writer with a ctypes-based C API in metadata.py
- Make TextEmbedderResult and GestureRecognizerResult public
- Restore Python 3.9 compatibility for MediaPipe PIP package and allow file loading in OSS
- Retain error messages in Text Classifier Python API
- Update LLM Converter to retain error messages
- Create Python Pipeline for the new PIP package
- Improve Python Wheel on Mac and Windows
- Create MP Tasks DrawingUtils
- Replace the LLM Bundler Proto dependency with a C API
- Re-add GPU support to Python API
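
The "dispatch library that forwards all C API calls through a single thread" entry in the list above describes a common pattern for libraries whose native state is tied to one thread. A minimal sketch of that general pattern follows, assuming a standard-library single-worker executor; the class name `SingleThreadDispatcher` is hypothetical and this is not the MediaPipe dispatch library itself.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SingleThreadDispatcher:
    """Hypothetical sketch: runs every submitted call on one worker thread."""

    def __init__(self) -> None:
        # A pool with a single worker serializes all calls onto one thread.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def call(self, fn, *args, **kwargs):
        # Block the caller until the worker thread has produced the result.
        return self._executor.submit(fn, *args, **kwargs).result()

    def close(self) -> None:
        self._executor.shutdown(wait=True)


dispatcher = SingleThreadDispatcher()
# Every call reports the same thread id, whichever thread submitted it.
thread_ids = {dispatcher.call(threading.get_ident) for _ in range(5)}
dispatcher.close()
```

Blocking on `.result()` keeps the call synchronous for the caller while still guaranteeing that the wrapped function only ever executes on the worker thread.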
MediaPipe Dependencies
- Fix OpenCV dependency.
- Update WASM files for 0.10.32 release