Apache TVM v0.22.0

Pre-release

Introduction

The TVM community has worked since the last release to deliver the following exciting new improvements!

The main tags are listed below; the areas with the most progress are Relax (especially the PyTorch frontend) and FFI.

Please see the full list of commits for a complete view: v0.22.dev0...v0.22.0.rc0.

Community

None.

RFCs

None.

BugFix

  • #18352 - [Fix] Update ShapeView use in nccl.cc
  • #18324 - Fix binding for BERT
  • #18296 - [Fix] Add libxml2 dependency to fix Windows CI build failure
  • #18294 - [Fix] Set DRefObj and CUDAIPCMemoryObj as mutable
  • #18285 - [FFI] Enable load_inline on macOS
  • #18287 - [Hotfix] Fix conflicts caused by FFI-related renames
  • #18281 - [FFI] Fix bug in ffi.cpp.load_inline on Windows
  • #18262 - [NNAPI] Use kind() instead of type_key() after FFI refactor
  • #18244 - [Fix] Update FlashInfer JIT header lookup
  • #18237 - [FFI] Fix type_traits on DataType after SmallStr update
  • #18232 - [LLVM][Fix] Do not emit debuginfo on vscale or other unknown types
  • #18219 - [Fix] Resolve deadlock in PopenPoolExecutor and LocalBuilder
  • #18207 - [Fix][ONNX] No precision widening for numpy binary operations
  • #18209 - [ONNX][FRONTEND][Fix] Update Resize to accept ShapeExpr
  • #18210 - [Bug] Fix core dump in InferLayoutRMSNorm and fix typo
  • #18208 - [FFI][Fix] Update datatype registry calls to the new paths
  • #18190 - [Fix] Codegen fix for relax cutlass
  • #18170 - [Fix] Fix the wrong check for tuple node in #18163
  • #18174 - [Misc] Fix missing PadAttrs registration in op_attrs.py
  • #18158 - Fix NCCL build with GlobalDef registration
  • #18140 - [NNAPI] Fix type mismatch and test_mean annotation
  • #18138 - [Fix][ONNX] Fix constant ROI handling in resize2d when loading ONNX models
  • #18137 - [Fix][ONNX] Fix CumSum conversion when loading ONNX models

CI

  • #18245 - [LLVM][MSWIN] Fix LLVM module build with latest CI update
  • #18227 - Exit the build on AbortException
  • #18145 - [Test] Use roi_list variable instead of hardcoded values in ROI tensor creation

Docs

  • #18279 - [FFI] Initial bring-up of C++ docs
  • #18264 - Misc docs fixes
  • #18263 - [FFI] Initial docs scaffolding
  • #18261 - [FFI] Add missing files in packaging example
  • #18256 - [FFI] Wheel packaging
  • #18128 - [Doc] Visualize the architecture using a UML sequence diagram

Frontend

  • #18143 - [ONNX] Extend axes for layer_norm when gamma/beta are multi-dimensional

LLVM

  • #18204 - Fixes up to the latest LLVM 21
  • #18202 - [CPPTEST] Small fixes for LLVM >= 20

MetaSchedule

  • #18243 - [LLVM] Add RISC-V V-extension v1.0 kernels to MetaSchedule

Metal

  • #18290 - Fix MetalModuleCreate
  • #18283 - [Fix] Fix type for device array in Metal API

ROCm

  • #18225 - Minor fixes for latest refactor

FFI

  • #18375 - [TE][FFI] Fix broken axis/reduce_axis properties in BaseComputeOp and ScanOp after FFI refactoring
  • #18376 - [FFI] Bump tvm-ffi to 0.1.0rc2
  • #18370 - [FFI] Bump tvm-ffi dependency
  • #18354 - [FFI][ABI] Bump tvm-ffi to latest
  • #18349 - [FFI][ABI] Bump tvm-ffi to latest
  • #18345 - [FFI][ABI] Bump tvm-ffi version to reflect RC ABI Update
  • #18332 - [FFI][ABI] Bump version ffi to latest
  • #18314 - [REFACTOR][FFI] Split tvm-ffi into a separate repo
  • #18312 - [FFI][REFACTOR] Update TVM_FFI_STATIC_INIT_BLOCK to fn style
  • #18311 - [FFI][ABI] Better String and Nested Container handling
  • #18308 - [FFI][ABI] Refactor the naming of DLPack speed converter
  • #18307 - [FFI] Update load_inline interface
  • #18306 - [FFI][ABI][REFACTOR] Enhance DLPack Exchange Speed and Behavior
  • #18302 - [FFI][REFACTOR] Refactor python ffi call mechanism for perf
  • #18298 - [FFI] Fix system library symbol lookup
  • #18297 - [FFI] Temporarily skip Windows tests
  • #18295 - [FFI][ABI] Introduce generic stream exchange protocol
  • #18289 - [FFI][REFACTOR] Streamline Object Declare Macros
  • #18284 - [FFI][REFACTOR] Introduce UnsafeInit and enhance ObjectRef null safety
  • #18282 - [FFI] Relax default alignment and contiguity requirements
  • #18280 - [FFI][REFACTOR] Cleanup namespace
  • #18278 - [FFI] Temporarily skip load_inline tests on non-Linux platforms
  • #18277 - [FFI][REFACTOR] Cleanup tvm_ffi python API and types
  • #18276 - [FFI] Add ffi::Tensor.strides()
  • #18275 - [FFI][REFACTOR][ABI] Rename NDArray to Tensor
  • #18274 - [FFI] Update the interface of ffi.load_inline to match torch (see the sketch after this list)
  • #18273 - [FFI][ABI] Append symbol prefix for ffi exported functions
  • #18272 - [FFI] Construct NDArray.strides by default
  • #18271 - [FFI] Support inline module
  • #18270 - [FFI] Support Opaque PyObject
  • #18266 - [FFI] Update torch stream getter to use native torch c api
  • #18259 - [FFI][ABI] Introduce weak rc support
  • #18258 - [FFI] Fix two apparent migration issues
  • #18254 - [FFI][ABI] ABI updates for future metadata and complex ordering
  • #18249 - [FFI][CMAKE] Revert cmake libbacktrace URL and update submodule
  • #18246 - [FFI][CMAKE] Add missing download path for libbacktrace
  • #18234 - [FFI] Misc fixup for windows
  • #18233 - [FFI] Robustify the pyproject setup
  • #18226 - [FFI][REFACTOR] Establish tvm_ffi python module
  • #18221 - [FFI] Fix JSON parser/writer for the fast-math flag
  • #18218 - [FFI][REFACTOR] Cleanup API locations
  • #18217 - [FFI] AutoDLPack compatible with torch stream context
  • #18216 - [FFI][REFACTOR] Establish Stream Context in ffi
  • #18214 - [FFI][REFACTOR] Establish ffi.Module in python
  • #18213 - [FFI] Formalize ffi.Module
  • #18212 - [FFI] Make JSON Parser/Write fastmath safe
  • #18205 - [FFI][REFACTOR] Cleanup entry function to redirect
  • #18200 - [FFI][REFACTOR] Update Map ABI to enable flexible smallMap switch
  • #18198 - [FFI][REFACTOR] Move Downcast out of ffi for now
  • #18192 - [FFI] Phase out ObjectPath in favor of AccessPath
  • #18191 - [FFI][REFACTOR] Refactor AccessPath to enable full tree repr
  • #18189 - [FFI][REFACTOR] Phase out getattr based attribute handling
  • #18188 - [FFI][REFACTOR] Migrate the Save/Load JSON to the new reflection
  • #18187 - [FFI][EXTRA] Serialization To/From JSONGraph
  • #18186 - [FFI] Lightweight json parser/writer
  • #18185 - [FFI] Introduce small string/bytes
  • #18184 - [FFI][REFACTOR] Hide StringObj/BytesObj into details
  • #18183 - [FFI][REFACTOR] Cleanup to align to latest ffi
  • #18172 - [REFACTOR][FFI] Phase out SEqualReduce/SHashReduce
  • #18178 - [FFI] Fix SmallMapInit with duplicated keys
  • #18177 - [FFI][REFACTOR] Isolate out extra API
  • #18176 - [FFI] Improve string equal/hash handling
  • #18166 - [FFI][REFACTOR] Migrate StructuralEqual/Hash to new reflection
  • #18165 - [FFI][REFACTOR] Enable custom s_hash/equal
  • #18160 - [FFI][REFACTOR] Introduce TypeAttr in reflection
  • #18156 - [FFI] Structural equal and hash based on reflection
  • #18149 - [FFI] Log and throw on duplicate function registration
  • #18148 - [FFI][REFACTOR] Phase out TVM_FFI_REGISTER_GLOBAL in favor of GlobalDef
  • #18147 - [FFI][REFACTOR] Modularize reflection
  • #18141 - [FFI][PYTHON] Improve the traceback generation in python
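
Several of the load_inline items above (#18271, #18274, #18281, #18285, #18307) reshape ffi.cpp.load_inline around torch's interface. A minimal sketch, assuming the torch-style parameter names carried over and that the standalone package imports as tvm_ffi; the real API may differ, so check the new FFI docs:

```python
# Hedged sketch of the torch-style load_inline interface (#18274).
# Parameter names are assumptions modeled on
# torch.utils.cpp_extension.load_inline; check the FFI docs for the
# exact signature.
import tvm_ffi

mod = tvm_ffi.cpp.load_inline(
    name="demo",            # name of the generated extension module
    cpp_sources=r"""
        int add_one(int x) { return x + 1; }
    """,
    functions=["add_one"],  # C++ functions to expose to Python
)
print(mod.add_one(41))      # expected: 42
```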

Relax

  • #18374 - [PyTorch] Improve the check for the no-bias case
  • #18358 - [Frontend][ONNX] Fix FastGelu when bias is not set
  • #18360 - [PyTorch] Support GRU op for ExportedProgram importer
  • #18359 - [PyTorch] Fix the segfault in from_exported_program when model returns (Tensor, None) tuple
  • #18321 - [ONNX] Support AllClassNMS Operator for ONNX Frontend
  • #18346 - [PyTorch] Support LSTM op for ExportedProgram importer
  • #18351 - [Frontend][Torch] Fix parsing error when input dimension of unbind is 1
  • #18331 - Update BasePyModule with faster DLPack converter for tensor conversion
  • #18343 - [PyTorch] Support MatrixMultiply op for ExportedProgram importer
  • #18336 - Operator and RoPE support for Llama4
  • #18329 - [Frontend][ONNX] Fix Expand conversion error: broadcast_to expects the input tensor shape to be broadcastable to the target shape
  • #18326 - [Backend] Implement R.call_py_func operator for calling Python functions from compiled TVM
  • #18313 - Introduce R.call_py_func operator for calling Python functions from Relax IR (see the sketch after this list)
  • #18301 - Fix RelaxToPyFuncConverter compatibility and improve fallback handling
  • #18288 - Add symbolic shape support to BasePyModule for dynamic tensor operations
  • #18269 - Add Relax to Python Function Converter
  • #18253 - Building TVMScript printer for IRModules with Python functions
  • #18229 - Add Python function support and BasePyModule for PyTorch integration
  • #18242 - Use Relax softplus operator in ONNX frontend
  • #18180 - [ONNX] Parse ONNX Upsample to Relax resize2d
  • #18179 - Support Relax Operator PReLU
  • #18163 - Fix issue in fuse concat ops by pattern
  • #18120 - [Fix] Fix potential out-of-bounds access in TupleRewriterNode
  • #18061 - [ONNX][Transform] Add mode choice, new mode, and warning for take()
  • #18122 - [KVCache] Fix kernel dispatch based on attention kinds
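
For the Python-interop line of work (#18229, #18269, #18313, #18326), here is a rough sketch of calling back into Python from Relax IR. Only the operator name R.call_py_func comes from the PR titles; the function name "my_scale", the sinfo_args spelling, and the pure=False marker are our assumptions:

```python
# Hedged sketch of R.call_py_func (#18313/#18326). The signature below is
# an assumption inferred from the PR titles, not the confirmed API.
from tvm.script import ir as I
from tvm.script import relax as R

@I.ir_module
class Module:
    @R.function(pure=False)  # calling into Python is a side-effecting call
    def main(x: R.Tensor((4,), "float32")) -> R.Tensor((4,), "float32"):
        # "my_scale" would be a Python function attached to the module at
        # runtime (e.g. via BasePyModule, #18229); the name is hypothetical.
        y = R.call_py_func("my_scale", (x,), sinfo_args=R.Tensor((4,), "float32"))
        return y
```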

TIR

  • #18319 - Refactor division simplification in RewriteSimplifier
  • #18341 - Support sequence comparisons in TVMScript
  • #18323 - Add support for conditional expressions in TVMScript
  • #18199 - Fix host/device function check for build
  • #18154 - Fix trivial index map [] -> [0]
  • #18151 - Decouple DeepEqual from StructuralEqual
  • #18134 - Add T.thread_return() for early thread exit in CUDA kernels (sketch below)
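
The new T.thread_return() intrinsic (#18134) retires out-of-range threads up front instead of wrapping the whole kernel body in a guard. A minimal sketch with an illustrative launch shape:

```python
# Hedged sketch of T.thread_return() (#18134): tail threads of the last
# block exit early rather than guarding the body with an if/else.
from tvm.script import tir as T

@T.prim_func
def vector_copy(A: T.Buffer((1000,), "float32"), B: T.Buffer((1000,), "float32")):
    T.func_attr({"global_symbol": "vector_copy", "tir.noalias": True})
    for bx in T.thread_binding(8, thread="blockIdx.x"):         # 8 blocks
        for tx in T.thread_binding(128, thread="threadIdx.x"):  # 128 threads each
            if bx * 128 + tx >= 1000:
                T.thread_return()  # early exit: 8 * 128 = 1024 > 1000
            B[bx * 128 + tx] = A[bx * 128 + tx]
```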

TVMScript

  • #17804 - Support continue and break in TVMScript (sketch below)
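
A minimal sketch of the new loop control flow (#17804); exactly which loop forms accept break and continue is best checked against the tests in that PR:

```python
# Hedged sketch of continue/break in TVMScript serial loops (#17804).
from tvm.script import tir as T

@T.prim_func
def first_index_of_seven(A: T.Buffer((64,), "int32"), out: T.Buffer((1,), "int32")):
    out[0] = -1
    for i in range(64):
        if A[i] < 0:
            continue  # skip negative entries
        if A[i] == 7:
            out[0] = i
            break     # stop at the first match
```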

cuda & cutlass & tensorrt

  • #18353 - [CUDA] Update FlashInfer JIT integration
  • #18320 - [TIR][CUDA] Preserve float precision in codegen with hexfloat output
  • #18300 - [CUDA] Support NVTX in CUDA 13
  • #18238 - [CUTLASS] Fix CUTLASS kernel compilation
  • #18144 - [CodeGen][CUDA] Add sinhf CUDA Math API for CodeGen

web

  • #18327 - [CMake] Install web/ directory in CMake for the Python package
  • #18168 - Fix incompatible part after FFI updates

Misc

  • #18330 - [Analyzer] Enhance ConstIntBoundAnalyzer and IntervalSet with modular set analysis
  • #18372 - Upgrade to CUTLASS 4.2.1
  • #18348 - [Python] Add library lookup path for tvm installed as a package
  • #18334 - Fix conflicting parameter name promote_dtype in FP8ComputeLegalize
  • #18325 - [flashinfer] Support directing JIT to FlashInfer GroupedGemm kernels
  • #18328 - Fix datatype error for GPT-2
  • #18318 - [3rdparty] Remove dlpack/libbacktrace from 3rdparty
  • #18317 - [FlashInfer] Update include path and interface
  • #18304 - Clear ext_lib_dll_names for macOS platform
  • #18299 - [Python] Fix runtime tensor import
  • #18252 - [Build] Complete TVM wheel building migration
  • #18236 - Upgrade CUTLASS to v4.2.0, supporting CUDA 13
  • #18251 - [Python] Complete Python packaging with scikit-build-core
  • #18248 - [Python] Update version.py to bump pyproject.toml automatically
  • #18291 - [3rdparty] Bump cutlass_fpA_intB_gemm to fix SM90 build
  • #18239 - [Build] Migrate Python packaging to pyproject.toml with scikit-build-core
  • #18222 - [NVSHMEM] Fix compatibility with CUDA code without nvshmem use
  • #18220 - [Thrust] Fix getting CUDA stream
  • #18211 - [TARGET] Add target for NVIDIA RTX 5060 Ti
  • #18206 - [CODEGEN][REFACTOR] tir.call_llvm_intrin to remove nargs
  • #18193 - Bump cutlass_fpA_intB_gemm to latest commit
  • #18197 - [REFACTOR] Update data type rewriter to enable recursive rewrite in Any
  • #18181 - [REFACTOR] Upgrade NestedMsg to use new ffi::Any mechanism
  • #18142 - [REFACTOR] Migrate TVM_FFI_REGISTER_GLOBAL to new reflection style
  • #18130 - Fix compilation warnings of unnecessary std::move() calls
  • #18129 - Delete redundant imports
  • #18055 - [Target] Support CUDA device function calls
  • #18127 - Revert "[Refactor] Build cython with isolated environment"
  • #18125 - Phase out StackVM runtime support
  • #18124 - [Refactor] Build cython with isolated environment
  • #18123 - [Codegen] Update LLVM version requirement for insertDeclare
