NVIDIA/cuEquivariance v0.10.0
v0.10.0
Repository: NVIDIA/cuEquivariance
Tag: v0.10.0
Published: 2026-04-22T01:20:18Z
Prerelease: no
Release notes:
Added
- Python 3.14 support finalized, including a fix for stale tuple hashes in `SegmentedTensorProduct` after in-place operand mutation, and updated CI matrix (#272)
- [Torch/JAX] `cuet.triangle_attention` / `cuex.triangle_attention`: new faster sm100f (CC 10.0/10.3) forward kernel for hidden_dim ≤ 256, bwd hidden_dim ≤ 128; `bias` is cast to q/k/v dtype (instead of always float32) under sm100f; non-contiguous input tensors are handled internally, so no manual contiguity assertion is required as long as shape requirements are met; updated docstrings. Only available on cu13 builds (#260)
- [JAX] MACE `flax.nnx` example restructured to use `nnx.split` + `@jax.jit` on `(graphdef, state)` instead of `@nnx.jit` on the module, removing the Python-side nnx graph traversal overhead from each training/inference step (#261)
- [JAX] NVTX markers added to the MACE examples to make step boundaries visible in `nsys` profiles (#266)
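The "stale tuple hashes" item refers to a general hazard with objects that cache a hash and are later mutated in place. The release note does not describe how `SegmentedTensorProduct` was fixed, so the following is only a minimal sketch of that failure mode and one common remedy (invalidating the cache on mutation). The class `Operands` and its method `add_segment` are hypothetical names invented for illustration and are not part of cuEquivariance.

```python
class Operands:
    """Toy container with a lazily cached hash (hypothetical, not library code)."""

    def __init__(self, segments):
        self._segments = list(segments)
        self._cached_hash = None  # filled in on the first hash() call

    def _key(self):
        # Immutable snapshot of the current contents, used for eq and hash.
        return tuple(self._segments)

    def __eq__(self, other):
        return isinstance(other, Operands) and self._key() == other._key()

    def __hash__(self):
        if self._cached_hash is None:
            self._cached_hash = hash(self._key())
        return self._cached_hash

    def add_segment(self, segment):
        """Mutate in place and drop the cached hash so it cannot go stale."""
        self._segments.append(segment)
        self._cached_hash = None  # without this line, hash() would keep the old value


a = Operands([(1, 2)])
hash(a)                # populate the cache before mutating
a.add_segment((3,))    # in-place mutation invalidates the cache

b = Operands([(1, 2), (3,)])
# Equal objects must hash equally; this only holds because the cache was reset.
print(a == b, hash(a) == hash(b))
```

Removing the `self._cached_hash = None` line in `add_segment` would leave `a == b` true while the two hashes differ, which is the kind of inconsistency that breaks dictionary and set lookups.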
Bug fix
- [Torch] `SegmentedPolynomial` checkpoint portability: GPU-saved models now load correctly on CPU. Implemented via `__reduce__` on `SegmentedPolynomialFromUniform1dJit`, `SegmentedPolynomialFusedTP`, `SegmentedPolynomialIndexedLinear`, and `SegmentedPolynomial`, plus graceful fallback when specific `cuequivariance_ops_torch` extensions (e.g. `uniform_1d`) are unavailable (#270)
- [Torch] Replaced deprecated `is_fx_tracing` with `is_fx_symbolic_tracing` (#270)
- [JAX] Restrict PTX 88 to sm_121 for CUDA 12.9+, avoiding breakage on other architectures (addresses the known issue noted in the 0.9.0 release) (#250)
- [Torch/JAX] `cuet.attention_pair_bias` / `cuex.attention_pair_bias`: fixed incorrect results when the hidden dimension is not a multiple of 32; the previous torch fallback for these cases is removed as the kernel now handles them correctly
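The checkpoint-portability fix relies on Python's `__reduce__` hook, which lets an object tell `pickle` to rebuild it from constructor arguments instead of copying its internal state. The sketch below shows that general pattern with a graceful fallback when an optional accelerated backend cannot be imported. It is not the cuEquivariance implementation: `PortablePolynomial`, `_load_fast_backend`, and the module name `_hypothetical_gpu_ext` are all made up for illustration, and the fallback here is a plain-Python polynomial evaluation.

```python
import pickle


def _load_fast_backend():
    """Try to import an optional accelerated extension; return None if missing.

    `_hypothetical_gpu_ext` is a deliberately nonexistent module name.
    """
    try:
        import _hypothetical_gpu_ext
        return _hypothetical_gpu_ext
    except ImportError:
        return None


class PortablePolynomial:
    """Evaluates c0 + c1*x + c2*x**2 + ... (hypothetical example class)."""

    def __init__(self, coefficients):
        self.coefficients = tuple(coefficients)
        # Environment-specific handle: resolved on whichever machine builds the object.
        self._backend = _load_fast_backend()

    def __reduce__(self):
        # Serialize only the constructor arguments. On load, __init__ runs again
        # and re-resolves the backend for the *loading* environment.
        return (PortablePolynomial, (self.coefficients,))

    def __call__(self, x):
        if self._backend is not None:
            return self._backend.evaluate(self.coefficients, x)
        # Fallback path: Horner's rule in plain Python.
        result = 0
        for c in reversed(self.coefficients):
            result = result * x + c
        return result


p = PortablePolynomial([1, 2, 3])           # 1 + 2x + 3x^2
restored = pickle.loads(pickle.dumps(p))    # round trip goes through __reduce__
print(restored.coefficients, restored(2))
```

Because the pickled payload contains only the coefficients, nothing tied to the saving machine's backend ends up in the serialized bytes; the loader decides which execution path is available.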
Notes
- [Torch] The `CUEQ_TORCH_COMPILE` environment variable (experimental) enables `torch.compile` for `cuet.triangle_attention`; useful for non-contiguous tensor inputs on Ampere/Hopper architectures
Documentation
- Fixed tutorial format issues (#274)
What's Changed
- Fix __eq__ / __lt__ on segmented polynomial types for Python and JAX compatibility by @mariogeiger in https://github.com/NVIDIA/cuEquivariance/pull/258
- [merge after release] Restrict PTX 88 to sm_121 for CUDA 12.9+ by @hsadasivan in https://github.com/NVIDIA/cuEquivariance/pull/250
- doc string and api update for triattn by @hsadasivan in https://github.com/NVIDIA/cuEquivariance/pull/260
- api: remove dim_order from triangle_attention by @hsadasivan in https://github.com/NVIDIA/cuEquivariance/pull/262
- Removed dim_order from triAttn by @phiandark in https://github.com/NVIDIA/cuEquivariance/pull/263
- nnx.split by @mariogeiger in https://github.com/NVIDIA/cuEquivariance/pull/261
- add nvtx marker by @paulz-nv in https://github.com/NVIDIA/cuEquivariance/pull/266
- Fixing some torch segmented_polynomial support by @phiandark in https://github.com/NVIDIA/cuEquivariance/pull/270
- Add Python 3.14 support and fix CI setup by @mariogeiger in https://github.com/NVIDIA/cuEquivariance/pull/272
- Fix tutorials doc format issues by @LiamZhang100 in https://github.com/NVIDIA/cuEquivariance/pull/274
- Add skill.md files by @mariogeiger in https://github.com/NVIDIA/cuEquivariance/pull/269
- Release 0.10.0 by @mariogeiger in https://github.com/NVIDIA/cuEquivariance/pull/275
New Contributors
- @paulz-nv made their first contribution in https://github.com/NVIDIA/cuEquivariance/pull/266
Full Changelog: https://github.com/NVIDIA/cuEquivariance/compare/v0.9.0...v0.10.0