Windows release build for xenia-project/xenia@6ee2e37.
[x64] Add AVX512 optimizations for OPCODE_VECTOR_COMPARE_UGT(Integer)
AVX512 has native unsigned integer comparison instructions, removing
the need to XOR the most-significant bit with a constant in memory to
use the signed comparison instructions. These instructions only write to
a k-mask register, though, and need an additional vpmovm2* instruction
to expand the mask register into a vector mask.
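
For illustration, the two approaches can be modelled in portable scalar C++. This is only a sketch of the semantics, not the emitter code from the commit: the names `Vec4`, `compare_ugt_xor`, `compare_ugt_kmask` and `expand_mask` are hypothetical, and four 32-bit lanes are assumed to stand in for one 128-bit vector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 128-bit vector of four 32-bit lanes.
using Vec4 = std::array<std::uint32_t, 4>;

// Older path: flip the sign bit of both operands, then compare as signed.
// Models an XOR with a sign-bit constant followed by a signed greater-than.
Vec4 compare_ugt_xor(const Vec4& a, const Vec4& b) {
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    const auto sa = static_cast<std::int32_t>(a[i] ^ 0x80000000u);
    const auto sb = static_cast<std::int32_t>(b[i] ^ 0x80000000u);
    out[i] = (sa > sb) ? 0xFFFFFFFFu : 0u;
  }
  return out;
}

// AVX512-style path, step 1: unsigned compare producing one bit per lane
// (models what a vpcmpu* instruction writes to a k-mask register).
std::uint8_t compare_ugt_kmask(const Vec4& a, const Vec4& b) {
  std::uint8_t k = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] > b[i]) {
      k = static_cast<std::uint8_t>(k | (1u << i));
    }
  }
  return k;
}

// AVX512-style path, step 2: expand each mask bit to an all-ones or
// all-zeros lane (models vpmovm2d).
Vec4 expand_mask(std::uint8_t k) {
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = ((k >> i) & 1u) ? 0xFFFFFFFFu : 0u;
  }
  return out;
}

// Sanity check: both paths must produce the same vector mask.
void self_check(const Vec4& a, const Vec4& b) {
  assert(expand_mask(compare_ugt_kmask(a, b)) == compare_ugt_xor(a, b));
}
```

Both paths yield the same vector mask; the difference is that the second needs no sign-bit constant loaded from memory, at the cost of the extra mask-to-vector step.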
As of Icelake (L = latency, T = throughput, both in cycles):
vpcmpu* is all L3/T1
vpmovm2d is L1/T0.33
vpmovm2{b,w} is L3/T0.33
As of Zen4:
vpcmpu* is all L3/T0.50
vpmovm2* is all L1/T0.25