This release uses a direct function-pointer lookup for auto-dispatched matrix-matrix multiplications and introduces a SPARSITY build-time flag to optionally rely on a binary search instead (which allows for a compact/sparse lookup table). Besides some tweaked code unrolling in the Intrinsic code path, the sample/benchmark program employs more optimized parallelization settings. The library now also comes with an adjusted build system to support the upcoming assembly code generator (GENASM=1). While we are still working on a public version of the assembly code generator, the library's build system is already interoperable with it.
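
To illustrate the trade-off between the two dispatch strategies mentioned above, here is a minimal sketch in C. It is not the library's actual implementation: the names `mm_kernel`, `MAX_DIM`, `KEY`, `lookup_dense`, and `lookup_sparse`, the table layout, and the two example kernels are all assumptions made for illustration. The dense table is indexed directly by (M, N, K) and reserves a slot for every combination, whereas the sparse table stores only the registered kernels, sorted by key, and is searched with a binary search.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel signature (column-major C += A * B); not the library's API. */
typedef void (*mm_kernel)(const double* a, const double* b, double* c);

#define MAX_DIM 4 /* illustrative upper bound for M, N, and K */
/* Hypothetical key: packs (m, n, k) into one integer, valid for 1..MAX_DIM. */
#define KEY(M, N, K) ((((M) * (MAX_DIM + 1)) + (N)) * (MAX_DIM + 1) + (K))

static void mm_generic(int m, int n, int k,
                       const double* a, const double* b, double* c)
{
  assert(a != NULL && b != NULL && c != NULL);
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < m; ++i) {
      double acc = c[j * m + i];
      for (int p = 0; p < k; ++p) acc += a[p * m + i] * b[j * k + p];
      c[j * m + i] = acc;
    }
  }
}

/* Two stand-ins for specialized kernels. */
static void mm_2x2x2(const double* a, const double* b, double* c) { mm_generic(2, 2, 2, a, b, c); }
static void mm_3x3x3(const double* a, const double* b, double* c) { mm_generic(3, 3, 3, a, b, c); }

static int in_range(int m, int n, int k)
{
  return 1 <= m && m <= MAX_DIM && 1 <= n && n <= MAX_DIM && 1 <= k && k <= MAX_DIM;
}

/* Direct lookup: one slot per (m, n, k); unregistered slots stay NULL. */
static const mm_kernel dense_table[MAX_DIM][MAX_DIM][MAX_DIM] = {
  [1][1][1] = mm_2x2x2,
  [2][2][2] = mm_3x3x3
};

static mm_kernel lookup_dense(int m, int n, int k)
{
  return in_range(m, n, k) ? dense_table[m - 1][n - 1][k - 1] : NULL;
}

/* Sparse lookup: only registered kernels are stored, sorted by ascending key. */
static const struct { int key; mm_kernel fn; } sparse_table[] = {
  { KEY(2, 2, 2), mm_2x2x2 },
  { KEY(3, 3, 3), mm_3x3x3 }
};

static mm_kernel lookup_sparse(int m, int n, int k)
{
  const size_t count = sizeof(sparse_table) / sizeof(*sparse_table);
  size_t lo = 0, hi = count;
  if (!in_range(m, n, k)) return NULL;
  const int key = KEY(m, n, k);
  while (lo < hi) { /* binary search for the first entry with key >= target */
    const size_t mid = lo + (hi - lo) / 2;
    if (sparse_table[mid].key < key) lo = mid + 1; else hi = mid;
  }
  return (lo < count && sparse_table[lo].key == key) ? sparse_table[lo].fn : NULL;
}
```

In this sketch, a caller would fall back to a general-purpose routine whenever a lookup returns `NULL`. The dense variant costs a single indexed load but its table grows with `MAX_DIM` cubed; the sparse variant takes a logarithmic number of comparisons but only needs memory for the kernels that actually exist.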