google-ai-edge/mediapipe v0.10.5 on GitHub

Framework and core calculator improvements

Fix crash in SavePngTestOutput
Log stack traces for combined CalculatorGraph statuses
Add a GpuOrigin parameter to TensorConverterCalculator
Replace some size EXPECTs by ASSERTs
Add support for label annotations (image/label/string and image/label/confidence). Also fix some clang-tidy issues.
Set confidence score of the bounding box label.
Add setGpuBufferVerticalFlip to GraphRunner TS API
Remove unsafe cast.
Apply the affine transform before drawing, in order to keep a constant line width regardless of face cropping.
Migrate packet messages auto registration to rely on MEDIAPIPE_STATIC_REGISTRATOR_TEMPLATE
Add end loop calculator for image size
Provide a way to disable static registration using MEDIAPIPE_DISABLE_STATIC_REGISTRATION
Header for callback_packet_calculator to allow dynamic registration for superusers
Support more GPU formats in tensor converter calculator.
Expose stream handlers in headers to allow dynamic registration for superusers
Expose tool calculators in headers to enable dynamic registration by superusers.
Dry-Run mode for static registration to make it easier to find all required static registrations
Fix MediaPipe build in Chromium.
Swap left and right hand labels.
Don't access "document" in WebWorker
Update PackMediaSequenceCalculator to support adding clip/media/id to the MediaSequence.
Update pose rendering
Update the header information for EnsureMinimumDefaultExecutorStackSize.
Move stream API loopback to third_party.
Add pose landmarks constants
Add an API in model_task_graph to create or use cached model resources.
Move stream API image_size to third_party.
Add C++ converters for C Text Classifier API
Move stream API rect_transformation to third_party.
Change the image label input from Classification to Detection.
Update port includes with IWYU to fix clang warnings in code where corresponding ports are used.
New image test utilities and memory management fixes.
Add a custom op resolver for fused batch norm.
Improve throttling logs by providing node info corresponding to a throttling stream.
Use ABSL_LOG in MediaPipe.
Remove reference pointer to prevent using a constant reference in the looped iteration variable
Remove unnecessary includes in threadpool_std_thread_impl.cc.
Make cache writes optional in InferenceCalculatorAdvancedGL
Update PackMediaSequenceCalculator to support setting clip/media/string, clip/media/confidence and clip/label/index.
Some spelling and grammar fixes in the comments.
Add notes/warnings for calculators which use dedicated GL contexts.
Remove video and stream mode in face stylizer.
Move stream API landmarks_projection to third_party.
Remove video and streaming mode for face stylizer.
landmarks_to_detection stream utility function.
Ensure that C headers don't import C++ types
Split GraphRunner into public API-declared interfaces and private TS implementations
Add option for nearest neighbor interpolation.
Fixes two issues with file handling on Windows:
Remove unconditional texture params reset so that float textures are handled correctly.
Fix the non-Unicode path of file_helpers on Windows
Modifying tensor_to_vector_float_calculator to take in D_BFLOAT16 values
Don't define field in ExternalFileHandler that's not used on Windows.
Clean up TensorConverterCalculator flipping behavior
Fix win32 build break in mediapipe.

MediaPipe Tasks update

This section should highlight changes made specifically for one platform that don't propagate to other platforms.

Android

Adds option to use tensor_ahwb in Android vendor processes
Add output size as parameters in Java ImageSegmenter
Change SegmentationOptions.builder() to be public
ImageGenerator Java API
Provide API/options to show intermediate results and generation progress for Java Image Generator.
Set enableFlowLimiting to false since only Image mode is supported for face stylizer.
Move loading tasks-vision-jni to individual vision task class

iOS

Added refactored iOS vision task runner sources
Removed convenience initializer from refactored MPPVisionTaskRunner
Updated iOS docs to use Swift names in place of Objective-C names
Added gesture recognizer and hand landmarker to iOS vision framework
Fixed directory creation issues in build_ios_framework.sh
Changed delegate method to optional
Added iOS image segmenter implementation file
Updated image segmenter bazel target to add MPPImageSegmenter.mm
Renamed option in MPPImageSegmenterOptions
Updated iOS face detector to use refactored vision task runner
Updated iOS image classifier to use refactored vision task runner
Changed order of methods in MPPImageSegmenter.mm
Fixed method call in MPPImageSegmenter.mm
Updated face landmarker, gesture recognizer, hand landmarker, and object detector to use refactored vision task runner
Replaced the old iOS vision task runner with the refactored task runner
Updated iOS gesture recognizer documentation to use Swift names
Updated iOS hand landmarker documentation to use Swift names
Moved iOS MPPHandLandmark enum to MPPHandLandmarker.h
Fixes iOS hand landmarker connections

JavaScript

vlog default executor and its config usage
Updates the runners to support wasm-style binary assets files, and allows their URLs to be explicitly specified as part of the WasmFileset.
Add 'types' to package.json
Add externs to js_library targets
Add API exports for MPMask and MPImage
Add Handedness to JS, C++ and Android API
Fix missing exports for FilesetResolver and static constants
Add exports to ImageSegmenterResult and InteractiveSegmenterResult

Python

Set the default running mode to Image for face stylizer.

Bug fixes

Internal fixes

Model Maker changes

Add tensorflow-addons to model_maker requirements.txt
Change to add the w_avg latent code to style encoding before layer swapping. This is a bug in the previous code. Also set training=True for encoder since this affects the encoding performance.
Add metadata writer into face stylizer.
Refactor text_classifier preprocessor to move away from using classifier_data_lib
Import image_util for using it in mediapipe face stylizer open sourcing.
Fix image_util shortcut import line
Change supported_ops to a Tuple instead of List to match the API definition.
Add a new from_image API to create face stylizer dataset from a single image. Also deprecate the from_folder API since we only support the one-shot use case now.
Add an API to run inference with face stylizer TF model.
Check if the image contains a valid face that can be aligned for stylization. If not, throw an exception for invalid input image. This is applied to both input stylized face and raw face.
Add allow_custom_ops to model_util.convert_to_tflite and enable custom ops for face stylizer.

MediaPipe Dependencies

Update WASM files for 10.5 release
