This release corrects how the assembly code generator is invoked and fixes the code that wraps the generated assembly code paths. The dispatch mechanism based on a direct lookup table now covers the entire possible problem space. The direct lookup table is also limited in size (M x N x K <= 65536) so that it does not exceed 512 KB on a 64-bit architecture (65536 pointer-sized entries of 8 bytes each). The fallback dispatch mechanism is still based on binary search, which does not suffer from the size issue. Feature-wise, the assembly code generator is now enabled by default, and an additional index generator scheme (MNK variable) has been implemented and documented.
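
To illustrate the two dispatch schemes described above, here is a minimal sketch in C. It is not the library's actual implementation: the names (`kernel_fn`, `direct_table`, `direct_index`, `mnk_entry`, `dispatch`, `stub_kernel`), the bounds `MAX_M`/`MAX_N`/`MAX_K`, and the index formula are all assumptions chosen for illustration. The sketch also assumes that problem sizes outside the table bounds go to the binary search, which is only one possible way to combine the two schemes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical kernel signature; stands in for a generated code path. */
typedef void (*kernel_fn)(const double* a, const double* b, double* c);

/* Assumed upper bounds; their product is the number of table entries.
 * 32 * 32 * 64 = 65536 entries, i.e. 512 KB with 8-byte pointers. */
enum { MAX_M = 32, MAX_N = 32, MAX_K = 64, TABLE_SIZE = MAX_M * MAX_N * MAX_K };

/* Compile-time check: the array size is negative if the limit is exceeded. */
typedef char table_fits_limit[(TABLE_SIZE <= 65536) ? 1 : -1];

/* Direct lookup table; zero-initialized, so NULL means "no kernel". */
static kernel_fn direct_table[TABLE_SIZE];

/* Map (m, n, k) to a table slot, or return -1 if outside the bounds. */
static long direct_index(int m, int n, int k)
{
  if (m < 1 || n < 1 || k < 1 || m > MAX_M || n > MAX_N || k > MAX_K) return -1;
  return ((long)(m - 1) * MAX_N + (n - 1)) * MAX_K + (k - 1);
}

/* Entry of the fallback table, kept sorted by (m, n, k). */
typedef struct { int m, n, k; kernel_fn fn; } mnk_entry;

/* Lexicographic comparison of (m, n, k) for bsearch. */
static int compare_mnk(const void* lhs, const void* rhs)
{
  const mnk_entry* a = (const mnk_entry*)lhs;
  const mnk_entry* b = (const mnk_entry*)rhs;
  if (a->m != b->m) return a->m < b->m ? -1 : 1;
  if (a->n != b->n) return a->n < b->n ? -1 : 1;
  if (a->k != b->k) return a->k < b->k ? -1 : 1;
  return 0;
}

/* Constant-time lookup inside the bounds, binary search otherwise.
 * Returns NULL if no kernel is registered for (m, n, k). */
static kernel_fn dispatch(int m, int n, int k,
                          const mnk_entry* sorted, size_t count)
{
  const long i = direct_index(m, n, k);
  if (i >= 0) return direct_table[i];
  if (count == 0) return NULL;
  {
    mnk_entry key;
    const mnk_entry* hit;
    key.m = m; key.n = n; key.k = k; key.fn = NULL;
    hit = (const mnk_entry*)bsearch(&key, sorted, count, sizeof(*sorted),
                                    compare_mnk);
    return hit ? hit->fn : NULL;
  }
}

/* Placeholder kernel used only to populate the tables in examples. */
static void stub_kernel(const double* a, const double* b, double* c)
{
  (void)a; (void)b; (void)c;
}
```

The direct table trades memory for a constant-time lookup, which is why its size has to be capped. The sorted array only stores the kernels that actually exist, so its lookup cost grows logarithmically with the number of kernels rather than with the size of the problem space.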