uxlfoundation/oneDNN v2.7.1 on GitHub

This is a patch release containing the following changes to v2.7:

Fixed performance regression for batch normalization primitive in TBB and threadpool configurations (cd953e4)
Improved grouped convolution performance on Xe Architecture GPUs (d7a781e, cb1f3fe, 4e84474, 7ba3c40)
Fixed runtime error in int8 reorder on Intel GPUs (53532a9)
Reverted MEMFD allocator in Xbyak to avoid segfaults in high load scenarios (3e29ae2)
Fixed incorrect caching of BRGEMM-based matmul primitive implementations with trivial dimensions (87cd979)
Improved depthwise convolution performance with per-tensor binary post-ops on Intel CPUs (f430a5a)
Extended threadpool API to manage maximum concurrency (8a1e959, 64e5594)
Fixed potential integer overflow in BRGEMM-based convolution implementation (25ccee3)
Fixed performance regression in concat primitive with format_tag::any on Intel CPUs (2a60ade, feb614d)
Fixed compile-time warnings in matmul_perf example (b5faa77)
Fixed 'insufficient registers in requested bundle' runtime error in convolution primitive on Xe Architecture GPUs (4c9d46a)
Addressed performance regression for certain convolution cases on Xe Architecture GPUs (f28b58a, 18764fb)
Added support for Intel DPC++/C++ Compiler 2023 (c3781c6, a1a8952, 9bc87e6, e3b1987)
Fixed int8 matmul and inner product performance regression on Xe Architecture GPUs (3693fbf, c8adc17)
Fixed accuracy issue for convolution, inner product, and matmul primitives with tanh post-op (88b4e57, 83ce6d2, 6224dc6, 10f0d0a)
Suppressed spurious build warnings with GCC 11 (44255a8)