lightvector/KataGo v1.3.2: OpenCL Major Speedup, Defaults, GTP and Other Fixes (on GitHub)

This release should bring a significant OpenCL performance improvement for users without NVIDIA tensor core GPUs, namely anything less top-end than an RTX 20xx card or similar. For GPUs that do support tensor cores, the CUDA version is likely still faster. Many other fixes and a few missing features have also been added.

NOTE: The new OpenCL implementation will need to re-tune itself the first time you start this new version, so be patient on the first startup and/or run it in the console the first time.

If you're a new user, don't forget to check out this section for getting started and basic usage.

New Neural Nets! Yay!

g170-b20c256x2-s1913382912-d435450331 ("g170 20 block s1.91G") - A new 20 block net that is yet another 115 Elo (+/- 30) stronger than the previous net. This should be the new strongest KataGo net!
g170e-b15c192-s1672170752-d466197061 ("g170e 15 block s1.67G") - This 15 block net is probably the last extended-training 15 block net that KataGo will be producing. It is probably about 20-50 Elo stronger than the previous one, which might put it about on par with ELF OpenGo v2 at equal playouts, for high hundreds or low thousands of playouts. That makes it a very strong net given that it is only 15 x 192 in size, and hopefully ideal for weaker to moderate-level hardware.

These are attached below. All other currently-released g170 nets are here: https://d3dndmfyhecmj0.cloudfront.net/g170/neuralnets/index.html
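To give a rough feel for what the Elo figures above mean in practice, here is a minimal sketch that converts an Elo difference into an expected score. It assumes the standard logistic Elo model with a 400-point scale; the release notes do not say which rating model was used to measure the nets, so treat the results as an approximation only. The function name `expected_score` is made up for this illustration.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side under the standard logistic
    Elo model (assumption: 400-point scale, not confirmed by the release notes)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # Elo gaps quoted in the notes: 20-50 for the 15 block net, 115 for the 20 block net.
    for diff in (20, 50, 115):
        print(f"+{diff} Elo -> expected score {expected_score(diff):.3f}")
```

Under this model a gap of 115 Elo corresponds to the stronger net scoring roughly two games out of three against the weaker one.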

Notable Changes in This Release

Much improved xgemm implementation for OpenCL version - overall OpenCL performance should be improved by 10%-50%, depending on your hardware and threads!
All options in the GPU-related sections of the GTP config are now optional and have better defaults. KataGo will automatically choose a batch size, and on the CUDA version it will automatically detect what flavor of GPU you have and enable or disable FP16 accordingly. Multiple GPUs will not be used automatically, however. If you want KataGo to use a larger cache to be a little faster, want it to use multiple GPUs, or run into problems with the automatic FP16 choice, you can still override the defaults.
Benchmark's thread suggestion greatly improved (./katago benchmark -config GTP_CONFIG.cfg -model MODEL.txt.gz), based on some new test data. The old version was too conservative, particularly on very strong machines; the new one will be a bit more aggressive about recommending larger numbers of threads.
GTP commands final_status_list and final_score will now use KataGo's neural net to guess an evaluation of the position if invoked when the game is not over or not fully cleaned up. For Japanese-rules games, such as when running it on KGS, this should now let KataGo score and mark dead stones in all common cases (I think)! The heuristics here may still be a bit rough, however, and could possibly behave weirdly in certain sekis, or there may be more basic issues since I haven't specifically gotten set up to test on KGS, so let me know if you run into issues.
GTP command fixed_handicap is now supported in KataGo.
A few new options for users running selfplay training: you can now terminate train.py after a fixed number of epochs, terminate gatekeeper once it's done passing a net, and disable autoreject of old nets.
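For anyone scripting against the GTP commands mentioned above (final_score, final_status_list, fixed_handicap), here is a minimal sketch of formatting a command line and parsing a reply. It relies only on the general GTP convention that a reply starts with `=` (success) or `?` (failure), optionally followed by a numeric id. The helper names and the canned replies are hypothetical and are not actual KataGo output.

```python
def format_command(name: str, *args: str) -> str:
    """Build a single GTP command line, e.g. 'final_status_list dead'."""
    return " ".join((name,) + args) + "\n"


def parse_response(raw: str):
    """Split a raw GTP reply into (ok, payload).

    A reply starts with '=' on success or '?' on failure, optionally
    followed directly by a numeric command id, then the payload.
    """
    text = raw.strip()
    if not text or text[0] not in ("=", "?"):
        raise ValueError(f"not a GTP response: {raw!r}")
    ok = text[0] == "="
    # Drop an optional numeric id that directly follows the status character.
    payload = text[1:].lstrip("0123456789").strip()
    return ok, payload


if __name__ == "__main__":
    print(format_command("fixed_handicap", "4"), end="")
    print(format_command("final_status_list", "dead"), end="")
    # Hypothetical replies, for illustration only.
    print(parse_response("= W+7.5\n\n"))
    ok, stones = parse_response("= D4 E4\nQ16\n\n")
    print(ok, stones.split())
```

In a real setup the command lines would be written to the engine's standard input and the replies read back until a blank line; that plumbing is left out here to keep the sketch self-contained.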

EDIT: Reverted the automatic use of FP16 on the CUDA version for Pascal-architecture NVIDIA GPUs when cudaUseFP16=auto. FP16 is a mild performance boost for many of these GPUs, but on some setups it may only cost precision for little gain, and it could cause issues for some users. If you have a recentish but non-tensor-core GPU, you can try setting cudaUseFP16=true instead of cudaUseFP16=auto in your gtp config and benchmark it.
EDIT: Also fixed some additional bugs in the GTP protocol regarding races between pondering and other commands, and some long-standing issues with handicap stone handling. Additionally, KataGo will now tolerate handicap being placed by alternating black moves and white passes at the start of a game.
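If you want to try the cudaUseFP16=true override from a script rather than by hand, a minimal sketch follows. It assumes the config is plain `key=value` lines with `#` comments. The helper `set_option` and the sample text (including `someOtherOption`) are made up for this example and are not part of KataGo.

```python
import re


def set_option(cfg_text: str, key: str, value: str) -> str:
    """Return cfg_text with `key` set to `value`.

    Rewrites the first active (non-commented) 'key=...' line if one
    exists; otherwise appends a new line at the end.
    """
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key}={value}"
    if pattern.search(cfg_text):
        # A function replacement avoids backslash interpretation in `value`.
        return pattern.sub(lambda _m: new_line, cfg_text, count=1)
    sep = "" if (not cfg_text or cfg_text.endswith("\n")) else "\n"
    return cfg_text + sep + new_line + "\n"


if __name__ == "__main__":
    # Hypothetical config contents, for illustration only.
    sample = "# GPU settings\nsomeOtherOption=1\ncudaUseFP16=auto\n"
    print(set_option(sample, "cudaUseFP16", "true"), end="")
```

After changing the setting, rerunning the benchmark command with the same config and model is the way to check whether FP16 actually helps on your hardware.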
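The handicap placement by alternating moves described above can be sketched as a sequence of GTP `play` commands. This is only an illustration: the function name is made up, the vertices are arbitrary, and it assumes no white pass is sent after the final handicap stone, a detail the notes do not specify.

```python
def handicap_commands(vertices):
    """Place handicap stones as black moves separated by white passes.

    Assumption: no trailing white pass after the last black stone,
    so White makes the first real move afterwards.
    """
    commands = []
    for i, vertex in enumerate(vertices):
        commands.append(f"play B {vertex}")
        if i < len(vertices) - 1:
            commands.append("play W pass")
    return commands


if __name__ == "__main__":
    # Arbitrary example vertices for a three-stone handicap.
    for line in handicap_commands(["D4", "Q16", "D16"]):
        print(line)
```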
