github NVIDIA/cccl python-0.5.0
CCCL Python Libraries (v0.5.0)

These are the release notes for the cuda-cccl Python package version 0.5.0, dated February 5th, 2026. The previous release was v0.4.5.

cuda-cccl is in "experimental" status, meaning that its API and feature set can change quite rapidly.

Installation

Please refer to the install instructions here

⚠️ Breaking change

Object-based API requires passing operator to algorithm __call__ method

This API change affects only users of the object-based API (expert mode).

Previously, constructing an algorithm object required passing the operator as an argument, but invoking it did not:

# step 1: create algorithm object
transformer = cuda.compute.make_unary_transform(d_input, d_output, some_unary_op)

# step 2: invoke algorithm
transformer(d_in1, d_out1, num_items1)  # NOTE: not passing some_unary_op here

The new behaviour requires passing the operator in both places, at construction and at invocation:

# step 1: create algorithm object
transformer = cuda.compute.make_unary_transform(d_input, d_output, some_unary_op)

# step 2: invoke algorithm
transformer(d_in1, d_out1, some_unary_op, num_items1)  # NOTE: need to pass some_unary_op here

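This change was introduced because in many situations (such as inside a loop), the operator itself and the globals or closures it references can change between construction and invocation, or between successive invocations. Passing the operator at call time lets each invocation use the operator as it currently stands.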
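To make the motivation concrete, here is a minimal pure-Python sketch of the closure problem described above. It runs without a GPU and does not use cuda.compute. ToyUnaryTransform, call_old, call_new and make_scaler are hypothetical names invented for this illustration, and the class is only a stand-in, not a description of how the library is implemented.

```python
# Toy stand-in for an algorithm object. This is NOT the cuda.compute
# implementation; all names here are hypothetical and for illustration only.
class ToyUnaryTransform:
    def __init__(self, op):
        # Old style: the operator is captured once, at construction time.
        self._captured_op = op

    def call_old(self, values):
        # Applies whatever operator was captured at construction.
        return [self._captured_op(v) for v in values]

    def call_new(self, values, op):
        # New style: the caller passes the operator to use for this call.
        return [op(v) for v in values]


def make_scaler(factor):
    # Each call returns a new closure over a different `factor`.
    def scale(x):
        return x * factor
    return scale


# Construct once, then invoke repeatedly in a loop with a changing operator.
transformer = ToyUnaryTransform(make_scaler(1))

old_results = []
new_results = []
for factor in (1, 2, 3):
    op = make_scaler(factor)
    old_results.append(transformer.call_old([10]))      # ignores the new closure
    new_results.append(transformer.call_new([10], op))  # uses the current closure

print(old_results)  # [[10], [10], [10]]
print(new_results)  # [[10], [20], [30]]
```

In the toy version, the object that only sees the operator at construction keeps applying the stale closure, while the call that receives the operator explicitly picks up each new one.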

Features

Improvements

  • Avoid unnecessary recompilation of stateful operators (#7500)
  • Improved cache lookup performance (#7501)

Bug Fixes

  • Fix handling of boolean types in cuda.compute (#7389)
