github lightvector/KataGo v1.12.3
OpenCL and TensorRT Bugfixes

If you're a new user, don't forget to check out this section for getting started and basic usage! If you don't know which version to choose (OpenCL, CUDA, TensorRT, Eigen, Eigen AVX2), read this: https://github.com/lightvector/KataGo#opencl-vs-cuda-vs-tensorrt-vs-eigen

Also, KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!

Users of the TensorRT version upgrading to this version of KataGo will also need to upgrade from TensorRT 8.2 to TensorRT 8.5.

As before, attached here are "bs29" versions of KataGo. These are just for fun, and don't support distributed training but DO support board sizes up to 29x29. They may also be slower and will use much more memory, even when only playing on 19x19, so you should use them only when you really want to try large boards.

The Linux executables were compiled on an old Ubuntu 18.04 machine. As with older releases, they might not work on your system, and it may be more reliable to build KataGo from source yourself, which fortunately is usually not so hard on Linux (https://github.com/lightvector/KataGo/blob/master/Compiling.md).

New Neural Net Architecture Support (release series v1.12.x)

As with prior releases in the v1.12.x series, this release supports KataGo's recently added new neural net architecture! See the release notes for v1.12.0 for details. The new neural net, "b18c384nbt", is also attached to this release for convenience. For general analysis use it should be similar in quality to recent 60-block models, but run significantly faster due to being a smaller net.

What's Changed in v1.12.3

This specific release, v1.12.3, fixes a few additional bugs in KataGo:

  • Fixes a performance regression for some GPUs on TensorRT that was introduced in v1.12.x (thanks @hyln9 !) (#741)
  • Mitigates a long-standing performance bug in the OpenCL tuner: on GPUs with dynamic boost or dynamic clock speeds, the tuner could get inaccurate timings due to the variable clock speed, most notably causing it on a few users' machines to fail to select FP16 tensor cores even when the GPU supported them and they would perform much better. The fix adds some additional computation for the GPU during tuning so that it is less likely to reduce its clock speed. Most users will not see an improvement, but a few may see a large one. (#743)
  • Fixes an issue where, depending on settings, KataGo in GTP or the analysis engine might fail to treat two consecutive passes as ending the game within its search tree.
  • Fixes an issue in the pytorch training code that prevented models from being easily trained on variable tensor sizes (i.e. max board sizes) in the data.
  • The contribute command in OpenCL will now also pretune for the new b18c384nbt architecture, the same way it pretunes for all other models.
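The tuner mitigation above is an instance of a general benchmarking technique: keep the device busy with warm-up work before (and between) timed runs, so that a GPU with dynamic boost settles at a steady clock speed instead of being measured at a throttled one. KataGo's actual tuner is C++/OpenCL; the following is only a minimal Python sketch of that idea, with hypothetical function names that are not part of KataGo's code.

```python
import time

def busy_kernel(n):
    # Stand-in for a GPU kernel launch (in the real tuner this would be
    # an enqueued OpenCL kernel); here we just do some arithmetic work.
    total = 0
    for i in range(n):
        total += i * i
    return total

def time_kernel(n, warmup_iters=5, timed_iters=10):
    # Warm-up phase: run the same workload several times first so a
    # device using dynamic clocks ramps up before we start measuring.
    for _ in range(warmup_iters):
        busy_kernel(n)
    # Timed phase: average over several runs to smooth remaining noise.
    start = time.perf_counter()
    for _ in range(timed_iters):
        busy_kernel(n)
    return (time.perf_counter() - start) / timed_iters
```

Without the warm-up phase, the first timed runs can execute at a reduced clock speed and make one kernel configuration look slower than another purely by timing artifact, which is exactly the failure mode described in the bullet above.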
