Feature:
- Support ION buffers on APU v4 and support float input
- Automatically sign libhexagon_nn_skel.so internally
- Remove the op module when CPU or GPU is not used
- Support boost and preference hints for APU
- Support building APU mace_run with no device connected
- Add DSP SoC ID 450
- Support fake warmup for OpenCL to speed up GPU warmup
- Add QNN backend and update the QNN library
- Add special models to CI and Micro runtime_load_model example
- Support OpenCL 3.0
- Support MTK ION mode
- Support dma_buf_heap
- Remove fallbacks caused by Reshape
- Add run validation for MACE-Micro
- Add MACE-Micro runtime load model interface
- Update MTK APU lib
Operator:
- Support sigmoid uint8 mode
- Support DepthToSpace, SpaceToDepth, ReduceSum and DetectionOutput operators
- Support depthwise_deconv2d host configuration
- Add supported ops to the Keras converter
- Support InstanceNorm operator and fold InstanceNorm from TensorFlow
- Support depth_to_space CRD mode
- Support DSP ops: leaky_relu, reshape
- Support HTP ops: depthwise_deconv, leaky_relu
- Support Keras ops: subtract, multiply
- Support op: HardSigmoid
Performance:
- Optimize performance of the CPU pooling and softmax ops
- Optimize Softmax on GPU and support GPU Reduce on channel dimension
Other:
- Fix some compatibility and stability bugs
- Fix some documentation errors
- Fix some converter bugs