This is a patch release containing the following changes to v3.1:
- Fixed correctness issue in pooling primitive with post-ops on Intel GPUs (4b7bc1a)
- Fixed segfault in `bfloat16` convolution on processors with Intel AMX support (461d55e)
- Fixed correctness issue in deconvolution primitive with post-ops on Intel GPUs based on Xe-LP architecture (c8943f5, ad3c62f)
- Fixed performance regression in `int8` convolution primitive with scales (7fa3b6f, bb3ecc4)
- Fixed correctness issue in `int8` convolution primitive with zero points on processors with Intel AVX2 and Intel DL Boost support (d721767, f6365b1)
- Fixed performance regression in `int8` inner product on processors with Intel AVX-512 and Intel DL Boost or Intel AMX support (2ede31e)
- Fixed segfault in pooling primitive with post-ops on processors with Intel SSE4.1 support (d712173, e4085a7)
- Fixed integer overflow in eltwise primitive on Intel GPUs (1932b3d, be05c33, 148006b, 2e64369, b4423fb, 87fd48f, 9a66ac6, 6ce52eb, 36bf079, 161d2b6, a5ef078, d058bd8)
- Fixed primitive creation error in large 3D convolutions on Intel GPUs (7c23d9e)
- Fixed performance regression in `fp32` convolution primitive weight gradient on Intel GPUs (ff209f9, 8710839)
- Fixed primitive creation error in `int8` convolution with zero points on Intel GPUs (cb91693, 85e58af)
- Fixed correctness issue in `fp32` convolution with Winograd algorithm on Intel GPUs (97ac885)
- Fixed primitive creation error in depthwise convolution on Intel GPUs based on Xe-LP architecture (51d608d)
- Fixed segfault during Graph partition compilation (a5d3568)
- Fixed crashes in inner product with unsupported weight formats on Intel64 CPUs (c0f4e93)
- Fixed an issue with compilation of Graph partitions containing matmul and using destination tensor layout `any` on Intel GPUs (ab2041d, f2c457d)
- Improved accuracy of eltwise primitive with `gelu_erf` algorithm on Intel64 CPUs (e67abef)
- Fixed correctness issue in `int8` matmul and inner product primitives on Intel GPUs based on Xe-HPG and Xe-HPC architecture (36aa622)
- Fixed potential correctness issue in `bfloat16` convolution weight gradient on processors with Intel AMX support (c93e673, 8da1083, f7acf98)
- Fixed memory corruption in inner product weight gradient on processors with Intel AMX support (b56a89e)
- Fixed integer overflow issue in convolution primitive on Intel GPUs (774deab, 663c2e4, 12d5743, 31ac0e0, e3cb07d)
- Fixed correctness issue in matmul primitive with broadcasted bias on Intel GPUs (3ba7e8b)
- Fixed correctness issue in inner product primitive with post-ops on processors with Intel AVX2 support (69260f6)
- Fixed out-of-bounds prefetching in matmul and inner product primitives on Intel GPUs (2b8f6b1)
- Fixed dispatching issues for `fp32` inner product implementation on processors with Intel AVX2 and Intel DL Boost support (f27dedb, f8d7c2e)
- Fixed division by zero issue in eltwise and eltwise post-op on Intel GPUs (f5654f5, a18c19e, a7c8cbc, 44355a6)
- Fixed correctness issue for 3D convolution primitive with post-ops (e6b93af)