NVIDIA/torch-harmonics v0.9.1
Repository: NVIDIA/torch-harmonics
Tag: v0.9.1
Published: 2026-05-21T12:03:14Z
Prerelease: no
Release notes:
- Fourier-Bessel filter basis; Hann window basis with per-type init factors via `get_init_factors`
- Standardized L2 normalization on the unit disk (harmonic, Zernike, Fourier-Bessel); on a disk of radius R the norm equals R via the Jacobian
- New DISCO basis normalization modes `modal` (mean-subtracted, reduces spectral leakage) and `geometric` (spherical cap area measure)
- Deprecated `basis_norm_mode="individual"` → `"nodal"` and `"area ratio"` → `"geometric"` (old names emit `DeprecationWarning`)
- Faster DISCO sparsity-pattern setup; OpenMP forward/backward kernels with up to ~55x speedup in some configurations
- Cross-attention (`key != value != query`) in `AttentionS2`, `NeighborhoodAttentionS2`, and `DistributedNeighborhoodAttentionS2`
- Serial attention upsampling when `nlon_out % nlon_in == 0`: CPU/CUDA/torch upsample kernels and matching reference `DistributedNeighborhoodAttentionS2` for self-attention and downsampling (distributed upsample not yet implemented)
- Optional per-head QK RMS norm (`use_qknorm`) for `AttentionS2` and `NeighborhoodAttentionS2`; shape checks across attention layers
- Fixed Q/K/V projection gain when input dim != embedding dim
- Breaking: default `NeighborhoodAttentionS2` scale changed from `1/sqrt(k_channels)` to `1/sqrt(k_channels // num_heads)` to match standard MHA head-dim scaling (`num_heads > 1`)
- Faster Legendre coefficient precomputation for SHT layers
- Differentiable `polar_halo_exchange` and `get_group_neighbors` for distributed attention
- More robust distributed transpose; `_reduce` clones before `all_reduce` for `torch.compile` compatibility
- Fixed Galewsky initial condition NaN from overflow; convolution adapter for mismatched residual channel counts
- Midpoint rule for filter-basis L2 norm integration (O(h^2)); improved `_precompute_convolution_tensor_s2` docstring
- Expanded attention tests (including upsample); new `tests/test_filter_basis.py`; broader layer integrity coverage
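
The breaking change to the default `NeighborhoodAttentionS2` scale is easiest to judge with numbers. The following is a minimal arithmetic sketch, not library code: `old_default_scale` and `new_default_scale` are hypothetical helper names that only evaluate the two formulas quoted in the notes, and the channel and head counts are arbitrary example values.

```python
import math

def old_default_scale(k_channels: int) -> float:
    # Formula quoted in the notes for the previous default.
    return 1.0 / math.sqrt(k_channels)

def new_default_scale(k_channels: int, num_heads: int) -> float:
    # Formula quoted in the notes for the new default (per-head dimension).
    return 1.0 / math.sqrt(k_channels // num_heads)

# Example values chosen for illustration only.
k_channels, num_heads = 64, 4
old = old_default_scale(k_channels)             # 1/sqrt(64) = 0.125
new = new_default_scale(k_channels, num_heads)  # 1/sqrt(16) = 0.25
ratio = new / old                               # sqrt(num_heads) when k_channels divides evenly
single_head_same = math.isclose(
    old_default_scale(k_channels), new_default_scale(k_channels, 1)
)
```

With a single head the two formulas agree, so only multi-head configurations that relied on the default scale see different attention logits. In this example the logits grow by a factor of `sqrt(num_heads) = 2`, which may matter when comparing results against checkpoints trained before the change.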
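
Two of the items above (L2 normalization on the disk, and midpoint-rule integration with O(h^2) error) can be checked on a toy example. The sketch below is an assumption-laden illustration in plain Python: the radial profile `sqrt(3/pi) * (1 - rho^2)` is a made-up function with unit L2 norm on the unit disk, not one of the library's bases, and `disk_l2_norm_sq` is a hypothetical helper rather than the library's quadrature routine.

```python
import math

def radial_profile(rho: float) -> float:
    """Illustrative radial function with unit L2 norm on the unit disk."""
    return math.sqrt(3.0 / math.pi) * (1.0 - rho * rho)

def disk_l2_norm_sq(radius: float, n: int) -> float:
    """Midpoint-rule estimate of the squared L2 norm of f(r / radius) on a disk."""
    h = radius / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h  # midpoint of the i-th radial cell
        total += radial_profile(r / radius) ** 2 * r * h  # factor r is the polar Jacobian
    return 2.0 * math.pi * total  # angular integral of a purely radial function

norm_unit = math.sqrt(disk_l2_norm_sq(1.0, 200))  # close to 1 on the unit disk
norm_r = math.sqrt(disk_l2_norm_sq(2.5, 200))     # close to R = 2.5 on the larger disk

# Halving h should cut the error by roughly 4 for a second-order rule.
err_coarse = abs(disk_l2_norm_sq(1.0, 50) - 1.0)
err_fine = abs(disk_l2_norm_sq(1.0, 100) - 1.0)
order_ratio = err_coarse / err_fine
```

Substituting `r = R * rho` turns `r dr` into `R^2 * rho d rho`, so the squared norm scales by `R^2` and the norm by `R`, which is the Jacobian effect the notes refer to.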
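
For the `basis_norm_mode` renames, the general pattern of accepting an old name while emitting a `DeprecationWarning` can be sketched as follows. This is a generic illustration using the standard `warnings` module; `resolve_basis_norm_mode` and `_RENAMED_MODES` are hypothetical names, and the library's actual handling may differ.

```python
import warnings

# Old name -> new name, as listed in the release notes.
_RENAMED_MODES = {"individual": "nodal", "area ratio": "geometric"}

def resolve_basis_norm_mode(mode: str) -> str:
    """Hypothetical helper: map a deprecated mode name to its replacement."""
    if mode in _RENAMED_MODES:
        new_mode = _RENAMED_MODES[mode]
        warnings.warn(
            f'basis_norm_mode="{mode}" is deprecated; use "{new_mode}" instead',
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
        return new_mode
    return mode  # current names pass through unchanged

# Record warnings instead of printing them, so the behavior can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved = [resolve_basis_norm_mode(m) for m in ("individual", "area ratio", "modal")]
```

Python hides `DeprecationWarning` by default unless it is triggered from code in `__main__`, so running a test suite with `python -W error::DeprecationWarning` is one way to locate remaining uses of the old names.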