Full changelog:
- [doc] Remove links from documentation articles to API reference (#2866) (by Chengchen(Rex) Wang)
- [Lang] Support struct fors on ti.any_arr (#2857) (by Yi Xu)
- [ci] M1 release (#2855) (by Jiasheng Zhang)
- [Doc] Update the API reference section. (#2856) (by Chengchen(Rex) Wang)
- [LLVM] [Bug] fix typo of PR #2781 (#2854) (by rocket)
- [Vulkan] Use reference counting based wrapper layer (#2849) (by Bob Cao)
- [gui] Two sided mesh (#2851) (by Dunfan Lu)
- [refactor] Make ti.ext_arr a special case of ti.any_arr (#2850) (by Yi Xu)
- [Lang] Add ti.Vector.ndarray and ti.Matrix.ndarray (#2808) (by Yi Xu)
- [ci] No need to specify arch on M1 CI. (#2845) (by Ailing)
- [Lang] Customized struct support (#2627) (by Andrew Sun)
- [Lang] Fix ti test parameters (#2830) (by squarefk)
- [ci] Enable verbose on M1 CI to collect more info on hanging jobs. (#2844) (by Ailing)
- [ci] Fixed bug of wrong os parameter (#2843) (by Jiasheng Zhang)
- [gui] GGUI 17/n: doc (#2842) (by Dunfan Lu)
- [gui] GGUI 16/n: examples (#2841) (by Dunfan Lu)
- [refactor] Move FrontendContext from global into Callable class. (by Ailing Zhang)
- [refactor] Decouple AsyncEngine from Program. (by Ailing Zhang)
- [refactor] Decouple MemoryPool from Program. (by Ailing Zhang)
- [refactor] Decouple opengl codegen compile from Program. (by Ailing Zhang)
- [refactor] Unify compile() for LlvmProgramImpl and MetalProgramImpl. (by Ailing Zhang)
- [refactor] Initial MetalProgramImpl implementation. (by Ailing Zhang)
- [gui] GGUI small fixups (#2840) (by Dunfan Lu)
- [ci] Fixed bugs of double env in release.yml (#2838) (by Jiasheng Zhang)
- [Lang] Let rescale_index support SNode as input parameter (#2826) (by Jack12xl)
- [refactor] Minor cleanup in program.cpp. (by Ailing Zhang)
- [gui] GGUI 15/n: Python-side code (#2832) (by Dunfan Lu)
- [gui] GGUI 14/n: Shaders (#2829) (by Dunfan Lu)
- [Lang] Experimental sparse matrix support on CPUs (#2792) (by FantasyVR)
- [Vulkan] Add relaxed FIFO presentation mode (#2828) (by Bob Cao)
- [ci] Conditional build matrix on release (#2819) (by Jiasheng Zhang)
- [gui] GGUI 13/n: Pybind stuff (#2825) (by Dunfan Lu)
- [gui] GGUI 12/n: Window and Canvas (#2824) (by Dunfan Lu)
- [gui] GGUI 11/n: Renderer (#2818) (by Dunfan Lu)
- [gui] GGUI 7.5/n: Avoid requiring CUDA toolchains to compile GGUI (#2821) (by Dunfan Lu)
- [vulkan] Let me pass (#2823) (by Dunfan Lu)
- [ci] Add timeout for every job in presubmit.yml (#2820) (by Jiasheng Zhang)
- [Vulkan] Device API Multi-streams, multi-queue, and initial multi-thread support (#2802) (by Bob Cao)
- [Doc] Fix example path and conda instruction link (#2815) (by Bo Qiao)
- [Lang] Fix unfolding subscripting inside ti.external_func_call() (#2806) (by squarefk)
- Enable tensor subscripting as input for external function call (#2812) (by squarefk)
- [gui] GGUI 10/n: IMGUI (#2809) (by Dunfan Lu)
- [Vulkan] [ci] Enable and release Vulkan (#2795) (by Chang Yu)
- [Vulkan] Fixing floating point load/store/atomics on global temps and context buffers (#2796) (by Bob Cao)
- [gui] GGUI 9/n: Renderables and Scene (#2803) (by Dunfan Lu)
- [vulkan] Fix bug in empty root buffer (#2807) (by Chang Yu)
- [Lang] Fix tensor based grouped ndrange for (#2800) (by squarefk)
- [gui] GGUI 8/n: Renderable class (#2798) (by Dunfan Lu)