NVIDIA/cuEquivariance v0.9.0
Repository: NVIDIA/cuEquivariance
Tag: v0.9.0
Published: 2026-02-17T22:56:08Z
Prerelease: no
Release notes:
0.9.0 (2026-02-17)
Added
- GB10 (DGX Spark) support
- Support for Python 3.13 and 3.14
- [JAX] Support for Triton 3.6.0
- [JAX] `flax.nnx` MACE example
- [Torch/JAX] Deterministic indexing mode for uniform 1d kernels
- [Torch/JAX] Parallel JIT compilation for uniform_1d kernels with per-kernel caching, significantly reducing compilation time. New optional environment variable `CUEQUIVARIANCE_OPS_NVRTC_CACHE_DIR` allows setting a directory for caching compiled kernels.
- Documentation: new tutorials for JAX and PyTorch segmented polynomials
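As a minimal sketch of how the new cache directory variable might be set from Python: the helper below creates a directory and exports it through `CUEQUIVARIANCE_OPS_NVRTC_CACHE_DIR`. The helper name `configure_kernel_cache` is hypothetical, and the sketch assumes the variable is read when kernels are first compiled, so it should be set before importing the library.

```python
import os
from pathlib import Path

# Variable name taken from the release notes above.
CACHE_ENV_VAR = "CUEQUIVARIANCE_OPS_NVRTC_CACHE_DIR"


def configure_kernel_cache(cache_dir):
    """Create cache_dir if needed and export it through CACHE_ENV_VAR.

    Hypothetical helper, not part of cuEquivariance. Assumes the variable
    must be set before the library compiles its first kernel.
    """
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    os.environ[CACHE_ENV_VAR] = str(path)
    return path
```

A typical call would be `configure_kernel_cache("~/.cache/cuequivariance")` at the top of a script, ahead of any `cuequivariance_torch` or `cuequivariance_jax` import. The path is only an example; the release notes do not prescribe one.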
Bug fix
- [JAX] Fixed Triton tuning issue for triangular multiplicative update
- [JAX] Compatibility with JAX 0.8.2: fixed FFI interface and dtype casting issues when x64 mode is not enabled
- [JAX] Improved triangle attention error messages
- [Torch/JAX] Fixed `yx_rotation` descriptor
- [Torch] TensorRT QDP plugin workaround
Breaking Changes
- [Torch/JAX] The environment variable `CUEQUIVARIANCE_OPS_USE_JIT` no longer exists. JIT compilation is now the default behavior for uniform_1d kernels (as it has already been for a few releases).
- [Torch/JAX] Renamed `filter_drop_unsued_operands` to `filter_drop_unused_operands` (typo fix)
- [Torch/JAX] Removed `nvfatbin` optional dependency
- [Torch] Removed deprecated primitive classes: `TensorProduct`, `EquivariantTensorProduct`, `SymmetricTensorProduct`, and `IWeightedSymmetricTensorProduct`. Use `cuet.SegmentedPolynomial` with `method='uniform_1d'` instead, or the high-level APIs (`cuet.ChannelWiseTensorProduct`, `cuet.FullyConnectedTensorProduct`, `cuet.SymmetricContraction`). Attempting to import these classes will raise an `ImportError` with migration instructions.
- [Torch] Removed deprecated low-level wrapper classes: `TensorProductUniform1d`, `TensorProductUniform4x1d`, `TensorProductUniform3x1dIndexed`, `TensorProductUniform4x1dIndexed`, and `SymmetricTensorContraction` from `cuequivariance_ops_torch`. Use `torch.ops.cuequivariance.uniform_1d` or `cuet.SegmentedPolynomial` instead.
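To find code affected by these breaking changes before upgrading, something like the following scan could help. It is only a sketch: the function `find_v090_breakages` is hypothetical and not part of cuEquivariance, and it matches identifiers by bare name, so a same-named class from another library would also be flagged.

```python
import ast

# All names below are copied from the breaking-changes list above.
REMOVED_CLASSES = {
    "TensorProduct",
    "EquivariantTensorProduct",
    "SymmetricTensorProduct",
    "IWeightedSymmetricTensorProduct",
    "TensorProductUniform1d",
    "TensorProductUniform4x1d",
    "TensorProductUniform3x1dIndexed",
    "TensorProductUniform4x1dIndexed",
    "SymmetricTensorContraction",
}
RENAMED = {"filter_drop_unsued_operands": "filter_drop_unused_operands"}
REMOVED_ENV_VARS = {"CUEQUIVARIANCE_OPS_USE_JIT"}


def find_v090_breakages(source):
    """Return sorted (line, message) pairs for names affected by 0.9.0.

    Hypothetical helper: matching is by bare name only, without checking
    which module a name was imported from.
    """
    findings = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.Attribute):
            names = [node.attr]
        elif isinstance(node, ast.Name):
            names = [node.id]
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            names = [node.value]  # catches env var names used as strings
        else:
            continue
        for name in names:
            if name in REMOVED_CLASSES:
                findings.add((node.lineno, f"{name} was removed"))
            elif name in RENAMED:
                findings.add((node.lineno, f"{name} was renamed to {RENAMED[name]}"))
            elif name in REMOVED_ENV_VARS:
                findings.add((node.lineno, f"{name} no longer exists"))
    return sorted(findings)
```

Running it over each file's text, for example `find_v090_breakages(open(path).read())`, gives a list of line numbers to review by hand. Environment variables set outside Python (shell scripts, job configs) are not covered by this scan.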
Notes
- [JAX] DGX Spark/GB10 (sm_121) with CUDA 12.9: This release uses PTX 87, which works correctly for most architectures but is not compatible with DGX Spark/GB10 on CUDA 12.9. To enable DGX Spark/GB10 support with CUDA 12.9, refer to #250 for a simple frontend integration tweak that restricts PTX 88 to sm_121 only. This fix will be merged after the 0.9.0 release.
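The condition described in this note can be written as a small check. This is a sketch under stated assumptions: the function name `matches_known_ptx_issue` is hypothetical, and it encodes only the one combination the note names (sm_121 with CUDA 12.9). A `False` result means the note does not mention that configuration, not that it has been verified to work.

```python
def matches_known_ptx_issue(compute_capability, cuda_version):
    """Return True for the configuration flagged in the 0.9.0 notes.

    compute_capability: (major, minor) tuple, e.g. (12, 1) for sm_121.
    cuda_version: version string such as "12.9" or "12.9.1".
    Hypothetical helper; it only mirrors the wording of the note.
    """
    major_minor = tuple(int(part) for part in cuda_version.split(".")[:2])
    return tuple(compute_capability) == (12, 1) and major_minor == (12, 9)
```

With PyTorch installed, the inputs could come from `torch.cuda.get_device_capability()` and `torch.version.cuda`. The workaround itself is described in #250 and is not reproduced here.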