- sped up x16r/x16s in some cases on the final step
- fixed a regression in simd, which should speed up x16r/x16s and restore the previous speed for other algos (bcd, sonoa, etc.)
- added hex algo
- intensity can now be set using sgminer-like numbers (old values are still supported)
- improved the API a bit: threads now contains the hashrate per GPU, not per thread
- fixed GPU numbering at startup when --opencl-threads is used