github lightvector/KataGo v1.10.0
TensorRT Backend, Many Minor Improvements

If you're a new user, don't forget to check out the documentation for getting started and basic usage! If you don't know which version to choose (OpenCL, CUDA, TensorRT, Eigen, Eigen AVX2), see the notes on choosing a version.

Also, KataGo is continuing to improve via distributed training, and if you'd like to donate your spare GPU cycles and support it, it could use your help there!

As before, attached here are "bs29" versions of KataGo. These are just for fun, and don't support distributed training but DO support board sizes up to 29x29. They may also be slower and will use much more memory, even when only playing on 19x19, so you should use them only when you really want to try large boards.

New TensorRT Backend

There is a new TensorRT backend ("trt8.2") in this release, thanks to some excellent work by hyln9! On strong NVIDIA GPUs, this backend can often be 1.5x the speed of any other backend. It is NOT universally faster, however; sometimes the CUDA backend can still be faster than the TensorRT backend. The two backends may also prefer different numbers of threads - try running the benchmark to see. TensorRT also tends to take noticeably longer to start up.
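
As a sketch of how to act on the benchmark's advice: once the benchmark has suggested a thread count for your chosen backend, you can pin it in your GTP config. The parameter name below matches KataGo's example gtp configs; the value 16 is just a placeholder, not a recommendation.

```
# Set this to the value suggested by ./katago benchmark for this backend.
# TensorRT and CUDA may prefer different values, so re-run the benchmark
# whenever you switch backends or GPUs.
numSearchThreads = 16
```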

Using TensorRT requires an NVIDIA GPU along with CUDA 11.1+, CUDNN 8.2+, and TensorRT 8.2 (precompiled executables in this release use CUDA 11.1 for Linux and CUDA 11.2 for Windows), which you can download and install manually from NVIDIA.

If you want an easier out-of-the-box setup and/or are using other GPUs, then OpenCL is still recommended as the easiest to get working.

Minor Features and Improvements

  • KataGo antimirror logic for GTP is slightly improved.
  • Analysis engine and kata-analyze now support reporting the standard deviation of ownership across search ("ownershipStdev").
  • Added minor options for random high-temperature policy initialization to the katago match command.
  • Very slight cross-backend performance improvement: most configurations will now by default avoid the multi-board-size GPU masking code if only one board size is used. (The analysis engine is the one major exception: you must specify requireMaxBoardSize, maxBoardXSizeForNNBuffer, and maxBoardYSizeForNNBuffer in the config, and then must not query for other board sizes.)
  • Added the code used to generate KataGo's opening books, runnable via ./katago genbook with an example config. You can generate your own books if you like, although be prepared to dive into the source code if you want to know exactly what particular parameters do.
  • KataGo should now (hopefully) handle non-ascii file paths on Windows.
  • GTP/Analysis "avoid" option now correctly applies when there is only 1-playout and moves are based on raw policy.
  • GTP/Analysis "avoid" option now correctly interacts with root symmetry pruning.
  • Fixed various bugs with the GTP command loadsgf.
  • Fixed minor issue reporting analysis values for terminal positions.
  • Fixed issue where during multithreading analysis would report zero-visit moves with weird stats.
  • Fixed a minor possible race if multiple katago distributed training contribute commands are started at once on the same machine.
  • More reliably tolerate and retry corrupted downloads in the contribute command for online distributed training.
  • Benchmark now respects defaultBoardSize in config.
  • Fixed issue in cmake build setup with mingw in Windows.
  • Fixed issue with swa_model namespace when loading a preexisting model for model training.
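
The new ownership standard-deviation output above can be exercised through the JSON analysis engine. Below is a minimal sketch that builds a query line; the request field name "includeOwnershipStdev" is assumed here to be the flag that enables the "ownershipStdev" response field, following the pattern of the existing "includeOwnership" flag, so check the analysis engine docs for the exact spelling.

```python
import json

# Build one query line for KataGo's JSON analysis engine.
# Each line written to the engine's stdin is a standalone JSON object.
query = {
    "id": "q1",
    "moves": [["B", "Q16"], ["W", "D4"]],
    "rules": "tromp-taylor",
    "komi": 7.5,
    "boardXSize": 19,
    "boardYSize": 19,
    "includeOwnership": True,
    # Assumed flag enabling the new "ownershipStdev" response field:
    "includeOwnershipStdev": True,
}

# Pipe this line to: ./katago analysis -model <model> -config <config>
# Responses come back one JSON object per line; when requested,
# "ownershipStdev" holds one per-point stdev value for each board point.
print(json.dumps(query))
```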
