NVIDIA/cccl python-0.7.0
CCCL Python Libraries (v0.7.0)
Repository: NVIDIA/cccl
Tag: python-0.7.0
Published: 2026-05-05T15:15:33Z
Prerelease: no
Release notes:
cuda-cccl Python package — version 0.7.0
Release date: May 5th, 2026. Previous release: v0.6.0.
cuda-cccl is in "experimental" status, meaning that its API and feature set can change quite rapidly.
Installation
Please refer to the install instructions here
API breaking changes
- All `cuda.compute` functions now require keyword-only arguments (#8772)
Every top-level function and factory (make_*) in cuda.compute now enforces keyword-only call syntax (i.e., all parameters must be passed by name). Positional calls will raise a TypeError.
Before:
reduce_into(d_in, d_out, op, num_items, h_init)
After:
reduce_into(d_in=d_in, d_out=d_out, num_items=num_items, op=op, h_init=h_init)
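To illustrate what keyword-only enforcement means at the call site, here is a minimal host-side sketch. The `reduce_into` below is a hypothetical pure-Python stand-in that only borrows the parameter names from the example above; it is not the `cuda.compute` implementation and does not touch the GPU.

```python
# Hypothetical stand-in: mimics a keyword-only signature, NOT the real cuda.compute code.
def reduce_into(*, d_in, d_out, op, num_items, h_init):
    """Toy host-side reduction; the bare `*` makes every parameter keyword-only."""
    acc = h_init
    for i in range(num_items):
        acc = op(acc, d_in[i])
    d_out[0] = acc


def add(a, b):
    return a + b


d_in = [1, 2, 3, 4]
d_out = [0]

# Passing every argument by name is accepted.
reduce_into(d_in=d_in, d_out=d_out, num_items=len(d_in), op=add, h_init=0)

# A positional call is rejected by Python before the function body runs.
try:
    reduce_into(d_in, d_out, add, len(d_in), 0)
    positional_error = None
except TypeError as exc:
    positional_error = exc

print(d_out[0], type(positional_error).__name__)
```

Because arguments are matched by name, their order at the call site no longer matters, which is why the "After" call can list num_items before op.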
Features
- System CUDA toolkit install extras: New pip extras sysctk12 / sysctk13 (and minimal-sysctk12 / minimal-sysctk13) allow installing cuda-cccl without pulling in cuda-toolkit as a pip dependency, for users who already have CUDA installed system-wide (#8608):
pip install cuda-cccl[sysctk13] # full install, system CTK
pip install cuda-cccl[minimal-sysctk13] # no Numba, system CTK
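After installing with one of these extras, it can be useful to confirm which distributions pip actually placed in the environment. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical, and the distribution names are taken from the notes above and assumed to match what pip records.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names are assumptions taken from the release notes.
for name in ("cuda-cccl", "cuda-toolkit"):
    print(name, installed_version(name) or "not installed")
```

With a sysctk extra, the expectation under these assumptions is that cuda-toolkit is reported as not installed while cuda-cccl is present.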
Performance
- Faster binary search: lower_bound/upper_bound are now implemented via transform with a small linear search for the final steps, improving throughput on modern GPUs (#8642)
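The general idea of finishing a binary search with a short linear scan can be sketched on the host as follows. This is a plain-Python illustration of the technique only, not the CCCL GPU implementation; the function name mirrors the notes, and the cutoff value of 8 is an arbitrary assumption.

```python
from bisect import bisect_left

LINEAR_CUTOFF = 8  # assumed threshold; the real tuning is not stated in the notes


def lower_bound(sorted_values, target, cutoff=LINEAR_CUTOFF):
    """Index of the first element >= target, using a hybrid search."""
    lo, hi = 0, len(sorted_values)
    # Binary phase: halve the candidate range until it is small.
    while hi - lo > cutoff:
        mid = (lo + hi) // 2
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    # Linear phase: scan the few remaining candidates in order.
    while lo < hi and sorted_values[lo] < target:
        lo += 1
    return lo


haystack = list(range(0, 100, 3))
needles = (-1, 0, 4, 50, 99, 200)
positions = [lower_bound(haystack, t) for t in needles]
print(positions)
```

The results can be checked against bisect_left from the standard library, which computes the same lower bound with a pure binary search.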
- Adaptive warpspeed scan: The scan tuning policy now automatically selects the warpspeed (lookahead) scan path when beneficial for the data type and architecture (#8158)
Bug Fixes
- Fix incorrect minimum CUDA architecture targeted when building the cccl.c native extension (#8631)