Full changelog:
- [ir] Improve ExternalTensorShapeAlongAxisStmt IR print result (#2665) (by Yu Chang)
- [ci] Refined release procedures (#2663) (by Jiasheng Zhang)
- [vulkan] More capabilities detection and enabling (#2660) (by Bob Cao)
- [ci] Change build method of ci tests (#2661) (by Jiasheng Zhang)
- [gui] GGUI 3/n: Add dependencies, interfaces, and backend-independent code (#2650) (by Dunfan Lu)
- [lang] [refactor] Add an ExtArray class for external arrays (#2651) (by Yi Xu)
- [ci] Enable torch tests during CI (#2656) (by Yi Xu)
- [ci] Add cuda bin folder to PATH (#2655) (by Dunfan Lu)
- [gui] GGUI 2/n: Add optional graphics queue, compute queue, and surface to EmbeddedVulkanDevice (#2648) (by Dunfan Lu)
- [vulkan] Build and test Vulkan backend in CI (#2647) (by Ye Kuang)
- [ci] Added changelog.py that does not depend on taichi (#2649) (by Jiasheng Zhang)
- [gui] GGUI 1/n: Add necessary cuda structs/enums/functions (#2645) (by Dunfan Lu)
- [vulkan] Use VulkanMemoryAllocator for memory allocation (#2644) (by Bob Cao)
- Improved SPIRV-Tools library search on Linux (#2643) (by masahi)
- [Lang] [refactor] Add Field classes for ti.field/ti.Vector.field/ti.Matrix.field (#2638) (by Yi Xu)
- [ci] Fix mac release and integrate windows release into github (#2641) (by Jiasheng Zhang)
- [misc] [doc] Rename some profiler APIs and add docstring, mark old names as deprecated (#2640) (by rocket)
- [doc] Better CUDA out-of-memory messages (#2172) (by 彭于斌)
- [Refactor] Split transformer.py into StmtBuilder and ExprBuilder (Stage 2) (#2635) (by xumingkuan)
- [bug] Fix missing ti.template() in rand_vector(n) in examples (#2636) (by xumingkuan)
- [doc] meta: s/alone/along/ (#2616) (by Eric Cousineau)
- [Bug] Fix osx release workflow (#2633) (by Ailing)
- [vulkan] Rename ManagedVulkanDevice to EmbeddedVulkanDevice (#2578) (by Ye Kuang)
- [ci] Add slash benchmark command for performance monitoring (#2632) (by rocket)