These are the release notes for the cuda-cccl Python package version 0.5.0, dated February 5th, 2026. The previous release was v0.4.5.
cuda-cccl is in "experimental" status, meaning that its API and feature set can change quite rapidly.
Installation
Please refer to the install instructions here
⚠️ Breaking change
Object-based API requires passing operator to algorithm __call__ method
This API change affects only users of the object-based API (expert mode).
Previously, constructing an algorithm object required passing the operator as an argument, but invoking it did not:
# step 1: create algorithm object
transformer = cuda.compute.make_unary_transform(d_input, d_output, some_unary_op)
# step 2: invoke algorithm
transformer(d_in1, d_out1, num_items1)  # NOTE: not passing some_unary_op here
The new behaviour requires passing it in both places:
# step 1: create algorithm object
transformer = cuda.compute.make_unary_transform(d_input, d_output, some_unary_op)
# step 2: invoke algorithm
transformer(d_in1, d_out1, some_unary_op, num_items1)  # NOTE: need to pass some_unary_op here
This change is introduced because in many situations (such as in a loop), the operator itself and the globals/closures it references can change between construction and invocation (or between invocations).
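The loop scenario can be illustrated with a small pure-Python toy model. This is only a sketch: `OldStyleTransform`, `NewStyleTransform`, and `make_scaler` are hypothetical stand-ins written for this example, they operate on plain lists, and they make no claim about how cuda.compute is implemented or how earlier versions behaved internally. The sketch only shows why an operator bound once at construction can go stale when a loop creates a new closure on each iteration.

```python
# Toy model only: these classes are hypothetical and are NOT part of cuda.compute.


class OldStyleTransform:
    """Binds the operator at construction; the call does not take it."""

    def __init__(self, op):
        self._op = op

    def __call__(self, d_in, d_out, num_items):
        for i in range(num_items):
            d_out[i] = self._op(d_in[i])


class NewStyleTransform:
    """Takes the operator at construction and again at every call."""

    def __init__(self, op):
        self._op = op  # used only when building the object

    def __call__(self, d_in, d_out, op, num_items):
        for i in range(num_items):
            d_out[i] = op(d_in[i])


def make_scaler(factor):
    """Return a closure that captures `factor`."""

    def scale(x):
        return x * factor

    return scale


data = [1, 2, 3]
old = OldStyleTransform(make_scaler(1))
new = NewStyleTransform(make_scaler(1))

old_results, new_results = [], []
for factor in (2, 3):
    op = make_scaler(factor)  # a fresh closure on every iteration
    out_old, out_new = [0] * len(data), [0] * len(data)
    old(data, out_old, len(data))      # still applies the construction-time operator
    new(data, out_new, op, len(data))  # applies the operator passed at this call
    old_results.append(out_old)
    new_results.append(out_new)

print(old_results)
print(new_results)
```

In this toy model, the object that receives the operator at call time always sees the current closure, whereas the one that binds it at construction keeps using the first one.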
Features
Improvements
- Avoid unnecessary recompilation of stateful operators (#7500)
- Improved cache lookup performance (#7501)
Bug Fixes
- Fix handling of boolean types in cuda.compute (#7389)