Additional improvement for slow systems:
When we detect the BLAS backend, we now set the engine's UCI options to [MaxPrefetch = 0, MinibatchSize = 8] unless these are overridden in Nibbler's config.json. For many people, these settings are significantly (or even drastically) faster.
v1.0.8 only affects the BLAS (i.e. CPU) backend. No change for GPU systems.
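
A rough sketch of the rule described above, in Python for illustration only. This is not Nibbler's actual code. The function name, the backend string "blas", and the assumption that user overrides live in a dict under an "options" key of config.json are all hypothetical. The only things taken from the note are the two option names and their values, and the `setoption name ... value ...` form is the standard UCI command.

```python
import json

# Defaults applied when the BLAS (CPU) backend is detected, per the note above.
BLAS_DEFAULTS = {"MaxPrefetch": 0, "MinibatchSize": 8}


def blas_setoption_commands(backend, user_options):
    """Return UCI 'setoption' lines for a BLAS backend, skipping user overrides.

    backend: backend name as detected (hypothetical string, e.g. "blas").
    user_options: options the user set in config.json (assumed to be a dict).
    """
    if backend.lower() != "blas":
        return []  # GPU backends are left untouched
    commands = []
    for name, value in BLAS_DEFAULTS.items():
        if name in user_options:
            continue  # an explicit user setting wins over the default
        commands.append(f"setoption name {name} value {value}")
    return commands


# Example: the user has pinned MinibatchSize, so only MaxPrefetch is adjusted.
# The "options" key is an assumed layout for config.json.
config = json.loads('{"options": {"MinibatchSize": 32}}')
print(blas_setoption_commands("blas", config.get("options", {})))
```

The point of the sketch is the ordering: an explicit user setting always wins, and nothing at all is sent for non-BLAS backends.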