Starting from the v1.2.0 release, Taichi follows semantic versioning: regular releases cut from the master branch bump the MINOR version, and the PATCH version is bumped only when cherry-picking critical bug fixes.

Deprecation Notice

Indexing multi-dimensional ti.ndrange() with a single loop index will be disallowed in future releases.

Highlights

New features

Offline Cache

We introduced the offline cache on CPU and CUDA backends in v1.1.0. In this release, we support this feature on other backends, including Vulkan, OpenGL, and Metal.

If your code behaves abnormally, disable the offline cache by setting the environment variable TI_OFFLINE_CACHE=0 or by passing offline_cache=False to ti.init(), and file an issue on Taichi's GitHub repo.
See Offline cache for more information.
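
As a minimal sketch of the two switches described above, the snippet below disables the offline cache from Python. Setting the variable before Taichi is imported is a cautious assumption rather than a documented requirement, and arch=ti.cpu is chosen only so that the example does not need a GPU. The import guard is there so the snippet also runs on a machine without Taichi installed.

```python
import os

# Option 1: environment variable. Set it before Taichi is imported so that
# ti.init() is sure to see it (a cautious assumption, not a documented rule).
os.environ["TI_OFFLINE_CACHE"] = "0"

try:
    import taichi as ti
except ImportError:
    ti = None  # Taichi is not installed; only the environment variable is set

if ti is not None:
    # Option 2: pass the flag directly to ti.init().
    ti.init(arch=ti.cpu, offline_cache=False)
```

Either option alone should be enough; both are shown together only for illustration.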

GDAR (Global Data Access Rule)

A checker is provided for detecting potential violations of global data access rules.

The checker only works in debug mode. To enable it, set debug=True when calling ti.init().
Set validation=True when using ti.ad.Tape() to validate the kernels captured by ti.ad.Tape().
If a violation occurs, the checker pinpoints the line of code breaking the rules.

For example:

import taichi as ti
ti.init(debug=True)

N = 5
x = ti.field(dtype=ti.f32, shape=N, needs_grad=True)
loss = ti.field(dtype=ti.f32, shape=(), needs_grad=True)
b = ti.field(dtype=ti.f32, shape=(), needs_grad=True)

@ti.kernel
def func_1():
    for i in range(N):
        loss[None] += x[i] * b[None]

@ti.kernel
def func_2():
    b[None] += 100

b[None] = 10
with ti.ad.Tape(loss, validation=True):
    func_1()
    func_2()

"""
taichi.lang.exception.TaichiAssertionError:
(kernel=func_2_c78_0) Breaks the global data access rule. Snode S10 is overwritten unexpectedly.
File "across_kernel.py", line 16, in func_2:
    b[None] += 100
    ^^^^^^^^^^^^^^
"""
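
To see why overwriting b matters, here is a plain-Python sketch of the arithmetic in the example above. It is not Taichi code and not how the checker is implemented. It assumes that the backward pass of func_1 runs after func_2 and re-reads b[None] at that point, which is the situation the checker reports.

```python
# Plain-Python model of the example: loss = sum(x[i] * b), so d(loss)/d(x[i]) = b.
N = 5
b_read_by_func_1 = 10.0                    # value of b[None] in the forward pass
b_after_func_2 = b_read_by_func_1 + 100.0  # func_2 overwrites b[None]

# The correct gradient uses the value of b that func_1 actually read.
expected_grad = [b_read_by_func_1] * N

# Assumption: the backward pass runs after func_2 and sees the overwritten b.
grad_from_overwritten_b = [b_after_func_2] * N

print(expected_grad)            # [10.0, 10.0, 10.0, 10.0, 10.0]
print(grad_from_overwritten_b)  # [110.0, 110.0, 110.0, 110.0, 110.0]
```

Under that assumption, the two lists differ, so the gradients would be silently wrong; the checker turns this into an explicit error.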

Improvements

Performance

Improved Vulkan performance with loops (#6072) (by Lin Jiang)

Python Frontend

PrefixSumExecutor is added to improve the performance of prefix-sum operations. The legacy prefix-sum function allocates auxiliary GPU buffers on every call, which causes a noticeable performance problem. The new PrefixSumExecutor avoids these repeated allocations: for arrays of the same length, it needs to be initialized only once and can then run any number of prefix-sum operations without redundant field allocations. The prefix-sum operation is currently supported only on the CUDA backend. (#6132) (by Yu Zhang)
Usage:
```
import taichi as ti

ti.init(arch=ti.cuda)  # prefix sum is currently supported only on CUDA

N = 100
dtype = ti.i32  # assumed element type for this example
arr0 = ti.field(dtype, N)
arr1 = ti.field(dtype, N)
arr2 = ti.field(dtype, N)
arr3 = ti.field(dtype, N)
arr4 = ti.field(dtype, N)

# initialize arr0, arr1, arr2, arr3, arr4, ...
# ...

# Perform an inclusive, in-place parallel prefix sum.
# Only one executor is needed for a given array length.
executor = ti.algorithms.PrefixSumExecutor(N)
executor.run(arr0)
executor.run(arr1)
executor.run(arr2)
executor.run(arr3)
executor.run(arr4)
```
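
For reference, an inclusive prefix sum replaces each element with the sum of all elements up to and including it. The plain-Python sketch below shows only that result on a small list. It is a sequential reference, not the executor's parallel GPU implementation, and inclusive_prefix_sum_inplace is a hypothetical helper name.

```python
from itertools import accumulate

def inclusive_prefix_sum_inplace(values):
    """Overwrite `values` with its inclusive prefix sum (sequential reference)."""
    values[:] = list(accumulate(values))
    return values

data = [3, 1, 4, 1, 5]
inclusive_prefix_sum_inplace(data)
print(data)  # [3, 4, 8, 9, 14]
```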
Runtime integer overflow detection for the addition, subtraction, multiplication, and left-shift operators is now available on the Vulkan, CPU, and CUDA backends when debug mode is on. To use overflow detection on the Vulkan backend, you need to enable printing; overflow detection for 64-bit multiplication on the Vulkan backend additionally requires NVIDIA driver 510 or higher. (#6178) (#6279) (by Lin Jiang)
For the following program:
```
import taichi as ti

ti.init(debug=True)

@ti.kernel
def add(a: ti.u64, b: ti.u64) -> ti.u64:
    return a + b

add(2 ** 63, 2 ** 63)
```
The following warning is printed at runtime:
```
Addition overflow detected in File "/home/lin/test/overflow.py", line 7, in add:
    return a + b
           ^^^^^
```
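
The warning is triggered because 2^63 + 2^63 = 2^64, which does not fit in 64 bits. A plain-Python sketch of such a check follows; add_u64_checked is a hypothetical helper for illustration, not Taichi's actual runtime instrumentation.

```python
U64_MAX = 2**64 - 1

def add_u64_checked(a, b):
    """Return (wrapped_sum, overflowed) for unsigned 64-bit addition."""
    exact = a + b  # Python integers have arbitrary precision and never wrap
    return exact & U64_MAX, exact > U64_MAX

print(add_u64_checked(2**63, 2**63))  # (0, True)
print(add_u64_checked(1, 2))          # (3, False)
```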
Printing is now supported on the Vulkan backend on Unix/Windows platforms. To enable printing on the Vulkan backend, follow the instructions at https://docs.taichi-lang.org/docs/master/debugging#applicable-backends (#6075) (by Ailing)

GGUI

Setting the initial position of a GGUI window is now supported. See https://docs.taichi-lang.org/docs/master/ggui#create-a-window for details and usage. (#6156) (by Mocki)

Taichi Examples

Three new examples from community contributors are also merged in this release. They include:

Animating the fundamental solution of a Laplace equation (#6249) (by @bismarckkk)
Animating the Kármán vortex street using LBM (#6249) (by @hietwl)
Animating the two-stream instability (#6249) (by JiaoLuhuai)

You can view these examples by running ti example in a terminal and selecting the corresponding index.

Important bug fixes

A ti.data_oriented class instance now correctly releases its allocated memory upon garbage collection. (#6256) (by Zhanlue Yang)
Taichi fields can now be correctly indexed using non-i32 typed indices. (#6276) (by Zhanlue Yang)
ti.select and ti.ifte can now be printed correctly in Taichi kernels. (#6297) (by Zhanlue Yang)
Before this release, setting u64 arguments to numbers outside the i64 range (2^63 or greater) raised an error, and u64 return values were treated as i64 in Python, so such integers were returned as negative numbers. This release fixes both bugs. (#6267) (#6364) (by Lin Jiang)
Taichi now raises an error when the number of the loop variables does not match the dimension of the ndrange for loop instead of malfunctioning. (#6360) (by Lin Jiang)
Calling ti.append with a vector/matrix now throws a proper error message. (#6322) (by Ailing)
Division on unsigned integers now works properly on LLVM backends. (#6128) (by Yi Xu)
Operator ">>=" now works properly. (#6153) (by Yi Xu)
Numpy int is now allowed for SNode shape setting. (#6211) (by Yi Xu)
Dimension check for GlobalPtrStmt is now aware of whether it is a cell access. (#6275) (by Yi Xu)
Before this release, Taichi autodiff could fail when the condition of an if statement depended on the index of an outer for-loop. This bug has been fixed in this release. (#6207) (by Mingrui Zhang)
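
The u64 return-value bug listed above comes down to reading the same 64 bits as a signed integer. A small sketch with Python's standard struct module shows the symptom; it illustrates the reinterpretation only and is not a description of Taichi's internals. The helper name reinterpret_u64_as_i64 is hypothetical.

```python
import struct

def reinterpret_u64_as_i64(value):
    """Reinterpret the 64 bits of an unsigned integer as a signed integer."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]

print(reinterpret_u64_as_i64(2**63))      # -9223372036854775808
print(reinterpret_u64_as_i64(2**64 - 1))  # -1
print(reinterpret_u64_as_i64(42))         # 42
```

Values below 2^63 are unaffected, which is why the bug only showed up for large u64 numbers.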

Full changelog:

[Error] Deprecate ndrange with number of the loop variables != the dimension of the ndrange (#6422) (by Lin Jiang)
Adjust aot_demo.sh (by jim19930609)
[error] Warn Linux users about manylinux2014 build on startup (#6416) (by Proton)
[misc] Bug fix (by jim19930609)
[misc] Bump version (by jim19930609)
[vulkan] [bug] Stop using the buffer device address feature on macOS (#6415) (by Yi Xu)
[Lang] [bug] Allow filling a field with Expr (#6391) (by Yi Xu)
[misc] Rc v1.2.0 cherry-pick PR number 2 (#6384) (by Zhanlue Yang)
[misc] Revert PR 6360 (#6386) (by Zhanlue Yang)
[misc] Rc v1.2.0 c1 (#6380) (by Zhanlue Yang)
[bug] Fix potential bug in #6362 (#6363) (#6371) (by Zhanlue Yang)
[example] Add example "laplace equation" (#6302) (by 猫猫子Official)
[ci] Android Demo: leave Docker containers intact for debugging (#6357) (by Proton)
[autodiff] Skip gradient kernel compilation for validation kernel (#6356) (by Mingrui Zhang)
[autodiff] Move autodiff gdar checker to release (#6355) (by Mingrui Zhang)
[aot] Removed constraint on same-allocation copy (#6354) (by PENGUINLIONG)
[ci] Add new performance monitoring (#6349) (by Proton)
[dx12] Only use llvm to compile dx12. (#6339) (by Xiang Li)
[opengl] Fix with_opengl when TI_WITH_OPENGL is off (#6353) (by Ailing)
[Doc] Add instructions about running clang-tidy checks locally (by Ailing Zhang)
[build] Enable readability-redundant-member-init in clang-tidy check (by Ailing Zhang)
[build] Enable TI_WITH_VULKAN and TI_WITH_OPENGL for clang-tidy checks (by Ailing Zhang)
[build] Enable a few modernize checks in clang-tidy (by Ailing Zhang)
[autodiff] Recover kernel autodiff mode after validation (#6265) (by Mingrui Zhang)
[test] Adjust rtol for sparse_linear_solver tests (#6352) (by Ailing)
[lang] MatrixType bug fix: Fix array indexing with MatrixType-index (#6323) (by Zhanlue Yang)
[Lang] MatrixNdarray refactor part13: Add scalarization for TernaryOpStmt (#6314) (by Zhanlue Yang)
[Lang] MatrixNdarray refactor part12: Add scalarization for AtomicOpStmt (#6312) (by Zhanlue Yang)
[build] Enable a few modernize checks in clang-tidy (by Ailing Zhang)
[build] Enable google-explicit-constructor check in clang-tidy (by Ailing Zhang)
[build] Enable google-build-explicit-make-pair check in clang-tidy (by Ailing Zhang)
[build] Enable a few bugprone related rules in clang-tidy (by Ailing Zhang)
[build] Enable modernize-use-override in clang-tidy (by Ailing Zhang)
[ci] Use .clang-tidy for check_static_analyzer job (by Ailing Zhang)
[mesh] Support arm64 backend for MeshTaichi (#6329) (by Chang Yu)
[lang] Throw proper error message if calling ti.append with vector/matrix (#6322) (by Ailing)
[aot] Fixed buffer device address import (#6326) (by PENGUINLIONG)
[aot] Fixed export of get_instance_proc_addr (#6324) (by PENGUINLIONG)
[build] Allow building test when LLVM is off (#6327) (by Ailing)
[bug] Fix generating LLVM AOT module for the second time failed (#6311) (by PGZXB)
[aot] Per-parameter documentation in C-API header (#6317) (by PENGUINLIONG)
[ci] Revert "Add end-to-end CI tests for meshtaichi (#6321)" (#6325) (by Proton)
[ci] Add end-to-end CI tests for meshtaichi (#6321) (by yixu)
[doc] Update the document about offline cache (#6313) (by PGZXB)
[aot] Include taichi_cpu.h in taich.h (#6315) (by Zhanlue Yang)
[Vulkan] [bug] Change the format string of 64bit unsigned integer type from %llu to %lu (#6308) (by Lin Jiang)
[mesh] Refactor MeshTaichi API (#6306) (by Chang Yu)
[lang] MatrixType bug fix: Allow dynamic_index=True when real_matrix_scalarize=True (#6304) (by Yi Xu)
[lang] MatrixType bug fix: Enable irpass::cfg_optimization if real_matrix_scalarize is on (#6300) (by Zhanlue Yang)
[metal] Enable offline cache by default on Metal (#6307) (by PGZXB)
[Vulkan] Add overflow detection on vulkan when debug=True (#6279) (by Lin Jiang)
[aot] Inline documentations (#6301) (by PENGUINLIONG)
[aot] Support exporting interop info for TiMemory on Cpu/Cuda backends (#6242) (by Zhanlue Yang)
[lang] MatrixType bug fix: Avoid checks for legacy Matrix-class when real_matrix is on (#6292) (by Zhanlue Yang)
[aot] Support setting vector/matrix argument in C++ wrapper of C-API (#6298) (by Ailing)
[lang] MatrixType bug fix: Fix MatrixType validations in build_call_if_is_type() (#6294) (by Zhanlue Yang)
[bug] Fix asserting failed when registering kernels with same name on Metal (#6271) (by PGZXB)
[ci] Add more release tests (#5839) (by Proton)
[lang] MatrixType bug fix: Allow indexing a matrix r-value (#6291) (by Yi Xu)
[bug] Fix duplicate runs with 'run_tests.py --cpp -k' when selecting AOT tests (#6296) (by Zhanlue Yang)
[bug] Fix segmentation fault with TextureOpStmt ir_printer (#6297) (by Zhanlue Yang)
[ci] Add taichi-aot-demo headless demos (#6280) (by Proton)
[bug] Serialize missing fields of metal::TaichiKernelAttributes and metal::KernelAttributes (#6270) (by PGZXB)
[metal] Implement offline cache cleaning on metal (#6272) (by PGZXB)
[aot] Reorganized C-API headers (#6199) (by PENGUINLIONG)
[lang] [bug] Fix setting integer arguments within u64 range but greater than i64 range (#6267) (by Lin Jiang)
[autodiff] Skip gdar checking for user defined grad kernel (#6273) (by Mingrui Zhang)
[bug] Fix AotModuleBuilder::add_compiled_kernel (#6287) (by PGZXB)
[Bug] [lang] Make dimension check for GlobalPtrStmt aware of whether it is a cell access (#6275) (by Yi Xu)
[refactor] Move setting visible device to vulkan instance initialization (by Ailing Zhang)
[bug] Add unit test to detect memory leak from data_oriented classes (#6278) (by Zhanlue Yang)
[aot] Ship runtime *.bc files with C-API for LLVM AOT (#6285) (by Zhanlue Yang)
[bug] Convert non-i32 type indices to i32 for GlobalPtrStmt (#6276) (by Zhanlue Yang)
[Doc] Renamed syntax.md to kernel_function.md, plus miscellaneous edits (#6277) (by Vissidarte-Herman)
[lang] Fixed validation scope (#6262) (by PENGUINLIONG)
[bug] Prevent ti.kernel from directly caching the passed-in arguments to avoid memory leak (#6256) (by Zhanlue Yang)
[autodiff] Add demote atomics before gdar checker (#6266) (by Mingrui Zhang)
[autodiff] Add grad check feature and related test (#6245) (by PhrygianGates)
[lang] Fixed contraction cast (#6255) (by PENGUINLIONG)
[Example] Add karman vortex street example (#6249) (by Zhao Liang)
[ci] Lift GitHub CI timeout (#6260) (by Proton)
[metal] Support offline cache on metal (#6227) (by PGZXB)
[dx12] Add DirectX-Headers as a submodule (#6259) (by Xiang Li)
[bug] Fix link error with TI_WITH_OPENGL:BOOL=ON but TI_WITH_VULKAN:BOOL=OFF (#6257) (by PGZXB)
[dx12] Disable DX12 for cpu only test. (#6253) (by Xiang Li)
[Lang] MatrixNdarray refactor part11: Fuse ExternalPtrStmt and PtrOffsetStmt (#6189) (by Zhanlue Yang)
[Doc] Rename index.md to hello_world.md (#6244) (by Vissidarte-Herman)
[Doc] Update syntax.md (#6236) (by Zhao Liang)
[spirv] Generate OpBitFieldUExtract for BitExtractStmt (#6208) (by Yi Xu)
[Bug] [lang] Allow numpy int as snode dimension (#6211) (by Yi Xu)
[doc] Update document about building and running Taichi C++ tests (#6228) (by PGZXB)
[misc] Disable the offline cache if printing ir is enabled (#6234) (by PGZXB)
[vulkan] [opengl] Enable offline cache by default on Vulkan and OpenGL (#6233) (by PGZXB)
[Doc] Update math_module.md (#6235) (by Zhao Liang)
[Doc] Update debugging.md (#6238) (by Zhao Liang)
[dx12] Add ti.dx12. (#6174) (by Xiang Li)
[lang] Set ret_type for AtomicOpStmt (#6213) (by Ailing)
[Doc] Update global settings (#6201) (by Olinaaaloompa)
[doc] Editorial updates (#6216) (by Vissidarte-Herman)
[Doc] Update hello world (#6191) (by Olinaaaloompa)
[Doc] Update math module (#6203) (by Olinaaaloompa)
[Doc] Update profiler (#6214) (by Olinaaaloompa)
[autodiff] Store if condition in adstack (#6207) (by Mingrui Zhang)
[Doc] Update debugging.md (#6212) (by Zhao Liang)
[Doc] Update debugging.md (#6200) (by Zhao Liang)
[bug] Fixed type inference error with ExternalPtrStmt (#6210) (by Zhanlue Yang)
[example] Request to add my code into examples (#6185) (by JiaoLuhuai)
[Lang] MatrixNdarray refactor part10: Remove redundant MatrixInitStmt generated from scalarization (#6171) (by Zhanlue Yang)
[aot] Apply ti_get_last_error_message() for all C-API test cases (#6195) (by Zhanlue Yang)
[llvm] [refactor] Merge create_call and call (#6192) (by Lin Jiang)
[build] Support executing manually-specified cpp tests for run_tests.py (#6206) (by Zhanlue Yang)
[doc] Editorial updates to field.md (#6202) (by Vissidarte-Herman)
[Lang] MatrixNdarray refactor part9: Add scalarization for AllocaStmt (#6168) (by Zhanlue Yang)
[Lang] Support GPU solve with analyzePattern and factorize (#6158) (by pengyu)
[Lang] MatrixField refactor 9/n: Allow dynamic index of matrix field when real_matrix=True (#6194) (by Yi Xu)
[Doc] Fixed broken links (#6193) (by Olinaaaloompa)
[ir] MatrixField refactor 8/n: Rename PtrOffsetStmt to MatrixPtrStmt (#6187) (by Yi Xu)
[Doc] Update field.md (#6182) (by Zhao Liang)
[bug] Relax dependent Pillow version (#6170) (by Ailing)
[Doc] Update data_oriented_class.md (#6181) (by Zhao Liang)
[Doc] Update kernels and functions (#6176) (by Zhao Liang)
[Doc] Update type.md (#6180) (by Zhao Liang)
[Doc] Update getting started (#6175) (by Zhao Liang)
[llvm] MatrixField refactor 7/n: Simplify codegen for TensorType allocation and access (#6169) (by Yi Xu)
[LLVM] Add runtime overflow detection on LLVM-based backends (#6178) (by Lin Jiang)
Revert "[LLVM] Add runtime overflow detection on LLVM-based backends" (#6177) (by Ailing)
[dx12] Add aot for dx12. (#6099) (by Xiang Li)
[LLVM] Add runtime overflow detection on LLVM-based backends (#6166) (by Lin Jiang)
[doc] C-API documentation & generator (#5736) (by PENGUINLIONG)
[gui] Support for setting the initial position of GGUI window (#6156) (by Mocki)
[metal] Maintain a print string table per kernel (#6160) (by PGZXB)
[Lang] MatrixNdarray refactor part8: Add scalarization for BinaryOpStmt with TensorType-operands (#6086) (by Zhanlue Yang)
[Doc] Refactor debugging (#6102) (by Olinaaaloompa)
[doc] Updated the position of Sparse Matrix (#6167) (by Vissidarte-Herman)
[Doc] Refactor global settings (#6071) (by Zhao Liang)
[Doc] Refactor external arrays (#6065) (by Zhao Liang)
[Doc] Refactor simt (#6151) (by Zhao Liang)
[Doc] Refactor Profiler (#6142) (by Olinaaaloompa)
[Doc] Add doc for math module (#6145) (by Zhao Liang)
[aot] Fixed texture interop (#6164) (by PENGUINLIONG)
[misc] Remove TI_UI namespace macros (#6163) (by Lin Jiang)
[llvm] Add comment about the structure of the CodeGen (#6150) (by Lin Jiang)
[Bug] [lang] Fix augmented assign for sar (#6153) (by Yi Xu)
[Test] Add scipy to test GPU sparse solver (#6162) (by pengyu)
[bug] Fix crashing when loading old offline cache files (for gfx backends) (#6157) (by PGZXB)
[lang] Remove print at the end of parallel sort (#6161) (by Haidong Lan)
[misc] Move some offline cache utils from analysis/ to util/ (#6155) (by PGZXB)
[Lang] Matrix/Vector refactor: support basic matrix ops (#6077) (by Mike He)
[misc] Remove namespace macros (#6154) (by Lin Jiang)
[Doc] Update gui_system (#6152) (by Zhao Liang)
[aot] Track layouts for imported image & tests (#6138) (by PENGUINLIONG)
[ci] Fix build cache problems (#6149) (by Proton)
[Misc] Add prefix sum executor to avoid multiple field allocations (#6132) (by YuZhang)
[opt] Cache loop-invariant global vars to local vars (#6072) (by Lin Jiang)
[aot] Improve C++ wrapper implementation (#6146) (by PENGUINLIONG)
[doc] Refactored ODOP (#6143) (by Vissidarte-Herman)
[Lang] Support basic sparse matrix operations on GPU. (#6082) (by Jiafeng Liu)
[Lang] MatrixField refactor 6/n: Add tests for MatrixField scalarization (#6137) (by Yi Xu)
[vulkan] Fix SPV physical ptr load alignment (#6139) (by Bob Cao)
[bug] Let every thread has its own CompileConfig (#6124) (by Lin Jiang)
[refactor] Remove redundant codegen of floordiv (#6135) (by Yi Xu)
[doc] Miscellaneous editorial updates (#6131) (by Vissidarte-Herman)
Revert "[spirv] Fixed OpLoad with physical address" (#6136) (by Lin Jiang)
[bug] [llvm] Fix is_same_type when the suffix of a type is the prefix of the suffix of the other type (#6126) (by Lin Jiang)
[bug] [vulkan] Only enable non_semantic_info cap when validation layer is on (#6129) (by Ailing)
[Llvm] Fix codegen for div (unsigned) (#6128) (by Yi Xu)
[Lang] MatrixField refactor 5/n: Lower access of matrix field element into CHI IR (#6119) (by Yi Xu)
[Lang] Fix invalid assertion for matrix values (#6125) (by Zhanlue Yang)
[opengl] Fix GLES support (#6121) (by Ailing)
[Lang] MatrixNdarray refactor part7: Add scalarization for UnaryOpStmt with TensorType-operand (#6080) (by Zhanlue Yang)
[doc] Editorial updates (#6116) (by Vissidarte-Herman)
[misc] Allow more commits in changelog generation (#6115) (by Yi Xu)
[aot] Import MoltenVK (#6090) (by PENGUINLIONG)
[vulkan] Instruct users to install vulkan sdk if they want to use validation layer (#6098) (by Ailing)
[ci] Use local caches on self-hosted runners, and code refactoring. (#5846) (by Proton)
[misc] Bump version to v1.1.4 (#6112) (by Taichi Gardener)
[doc] Fixed a broken link (#6111) (by Vissidarte-Herman)
[doc] Update explanation on data-layout (#6110) (by Qian Bao)
[Doc] Move developer utilities to contribution (#6109) (by Olinaaaloompa)
[Doc] Added Accelerate PyTorch (#6106) (by Vissidarte-Herman)
[Doc] Refactor ODOP (#6013) (by Zhao Liang)
[opengl] Support offline cache on opengl (#6104) (by PGZXB)
[build] Fix building with TI_WITH_OPENGL:BOOL=OFF and TI_WITH_DX11:BOOL=ON failed (#6108) (by PGZXB)

taichi-dev/taichi v1.2.0 on GitHub