oneapi-src/oneDNN v3.1.1 on GitHub

This is a patch release containing the following changes to v3.1:

Fixed correctness issue in pooling primitive with post-ops on Intel GPUs (4b7bc1a)
Fixed segfault in bfloat16 convolution on processors with Intel AMX support (461d55e)
Fixed correctness issue in deconvolution primitive with post-ops on Intel GPUs based on Xe-LP architecture (c8943f5, ad3c62f)
Fixed performance regression in int8 convolution primitive with scales (7fa3b6f, bb3ecc4)
Fixed correctness issue in int8 convolution primitive with zero points on processors with Intel AVX2 and Intel DL Boost support (d721767, f6365b1)
Fixed performance regression in int8 inner product on processors with Intel AVX-512 and Intel DL Boost or Intel AMX support (2ede31e)
Fixed segfault in pooling primitive with post-ops on processors with Intel SSE4.1 support (d712173, e4085a7)
Fixed integer overflow in eltwise primitive on Intel GPUs (1932b3d, be05c33, 148006b, 2e64369, b4423fb, 87fd48f, 9a66ac6, 6ce52eb, 36bf079, 161d2b6, a5ef078, d058bd8)
Fixed primitive creation error in large 3D convolutions on Intel GPUs (7c23d9e)
Fixed performance regression in fp32 convolution primitive weight gradient on Intel GPUs (ff209f9, 8710839)
Fixed primitive creation error in int8 convolution with zero points on Intel GPUs (cb91693, 85e58af)
Fixed correctness issue in fp32 convolution with Winograd algorithm on Intel GPUs (97ac885)
Fixed primitive creation error in depthwise convolution on Intel GPUs based on Xe-LP architecture (51d608d)
Fixed segfault during Graph partition compilation (a5d3568)
Fixed crashes in inner product with unsupported weight formats on Intel64 CPUs (c0f4e93)
Fixed an issue with compilation of Graph partitions containing matmul and using destination tensor layout any on Intel GPUs (ab2041d, f2c457d)
Improved accuracy of eltwise primitive with gelu_erf algorithm on Intel64 CPUs (e67abef)
Fixed correctness issue in int8 matmul and inner product primitives on Intel GPUs based on Xe-HPG and Xe-HPC architectures (36aa622)
Fixed potential correctness issue in bfloat16 convolution weight gradient on processors with Intel AMX support (c93e673, 8da1083, f7acf98)
Fixed memory corruption in inner product weight gradient on processors with Intel AMX support (b56a89e)
Fixed integer overflow issue in convolution primitive on Intel GPUs (774deab, 663c2e4, 12d5743, 31ac0e0, e3cb07d)
Fixed correctness issue in matmul primitive with broadcasted bias on Intel GPUs (3ba7e8b)
Fixed correctness issue in inner product primitive with post-ops on processors with Intel AVX2 support (69260f6)
Fixed out-of-bounds prefetching in matmul and inner product primitives on Intel GPUs (2b8f6b1)
Fixed dispatching issues for fp32 inner product implementation on processors with Intel AVX2 and Intel DL Boost support (f27dedb, f8d7c2e)
Fixed division by zero issue in eltwise and eltwise post-op on Intel GPUs (f5654f5, a18c19e, a7c8cbc, 44355a6)
Fixed correctness issue for 3D convolution primitive with post-ops (e6b93af)