This is a patch release containing the following changes to v3.10:
- Fixed an issue with reorder primitive returning `unimplemented` for cases when only one scale mask is defined on AArch64 processors (be92457)
- Fixed sporadic correctness issue in `fp32` matmul on Intel GPUs based on Xe2 architecture (b4a761c)
- Fixed correctness issue in `fp16`/`bf16` matmul on Intel GPUs based on Xe3 architecture (48c114b)
- Fixed performance regression in `bf16` convolution weight gradient on Intel Arc Graphics B-series (3b6665b)
- Improved convolution performance on AArch64 processors with SVE128 support (808227d)
- Fixed regression in matmul primitive creation time on Intel GPUs (599ecb5)
- Fixed potential overflow for matmul, convolution and inner product primitives with Arm Compute Library (be12d8c)
- Fixed convolution performance regression on Intel Arc Graphics B-series (7e27159)