uxlfoundation/oneDNN v3.8.2 on GitHub

This is a patch release containing the following changes relative to v3.8.1:

Fixed performance regression for f32 convolution primitive on processors with Intel AVX-512 instruction set support (5f3af68)
Introduced support for f16 destination in int8 matmul and int8 inner product on x64 CPUs (53fd12a, 22e252c, f5b2d7f, e4e2f1c)
Improved RNN primitive performance on processors with Intel AVX2 instruction set support (71e5d81, eb27db2, dd4e627, ff134e0, 5a86c1f, e9395ae)
Improved f32 matmul performance on processors with Intel AVX-512 instruction set support (1119339)
Fixed segmentation fault in f32 binary primitive with broadcast on x64 processors (2082e98)
Fixed correctness issue in f64 convolution weight gradient with bias on Intel Arc GPUs (a00bfab)
Updated spdlog component to version 1.15.3 (dbb3629)
Fixed potential undefined behavior in convolution on Intel GPUs (5ac3e31)
Fixed segmentation fault in convolution implementation with trivial filter on Intel CPUs (908c5fc, f0a0eee)
Fixed segmentation fault in f16 convolution with odd dimensions on processors with Intel AVX10.1 instruction set support (78d6835)
Improved convolution primitive descriptor creation time on x64 processors (e9c5366, fd9dc58, f1d038e)
Fixed performance regression in f16 matmul with int4 weights on Intel Arc Graphics B-series (38d761b)
Improved bf16 matmul performance on processors with Intel AMX instruction set support (0887aec)
Fixed correctness issue in f32 RNN primitive on processors with Intel AMX instruction set support (460a014)