NVIDIA/nvshmem v3.5.21-0
Repository: NVIDIA/nvshmem
Tag: v3.5.21-0
Published: 2026-02-28T17:19:05Z
Prerelease: no
Release notes:
NVIDIA® NVSHMEM 3.5.21 Release Notes
NVSHMEM is an implementation of the OpenSHMEM specification for NVIDIA GPUs. The NVSHMEM programming interface implements a Partitioned Global Address Space (PGAS) model across a cluster of NVIDIA GPUs. NVSHMEM provides an easy-to-use interface to allocate memory that is symmetrically distributed across the GPUs. In addition to a CPU-side interface, NVSHMEM provides an NVIDIA® CUDA® kernel-side interface that allows CUDA threads to access any location in the symmetrically distributed memory.
The release notes describe the key features, software enhancements and improvements, and known issues for NVSHMEM 3.5.21 and earlier releases.
Key Features and Enhancements
This NVSHMEM release includes the following key features and enhancements:
- Fixed a bug that broke ABI compatibility for the internal team structure.
NVSHMEM4Py release 0.2.2 includes the following:
- Removed an incorrect assumption that an NVSHMEM4Py-managed buffer has at most one child buffer (peer or multicast).
Compatibility
NVSHMEM 3.5.21 has been tested with the following:
NCCL:
- 2.28.3
CUDA Toolkit:
- 12.4
- 12.9
- 13.0
- 13.1
CPUs:
- *x86* and NVIDIA Grace™ processors
GPUs:
- NVIDIA Ampere A100
- NVIDIA Hopper™
- NVIDIA Blackwell
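As a rough illustration, the tested NCCL and CUDA Toolkit versions listed above can be encoded in a small check that flags components outside the tested set. This is a sketch only, not part of NVSHMEM: the names `TESTED_CUDA_TOOLKITS`, `TESTED_NCCL`, `major_minor`, and `untested_components` are hypothetical, and matching CUDA Toolkit versions by their major.minor prefix is an assumption, since the notes list only major.minor values. A version missing from the list is untested, not necessarily unsupported.

```python
# Tested versions copied from the compatibility list in these release notes.
TESTED_CUDA_TOOLKITS = {"12.4", "12.9", "13.0", "13.1"}
TESTED_NCCL = {"2.28.3"}


def major_minor(version: str) -> str:
    """Reduce a version such as '12.9.1' to its 'major.minor' prefix."""
    parts = version.strip().split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"unrecognized version string: {version!r}")
    return ".".join(parts[:2])


def untested_components(cuda_version: str, nccl_version: str) -> list[str]:
    """Return the names of components whose versions are not in the tested lists."""
    untested = []
    # Assumption: a CUDA patch release counts as tested if its major.minor is listed.
    if major_minor(cuda_version) not in TESTED_CUDA_TOOLKITS:
        untested.append("CUDA Toolkit")
    # NCCL is listed with a full version, so it is compared exactly.
    if nccl_version.strip() not in TESTED_NCCL:
        untested.append("NCCL")
    return untested


if __name__ == "__main__":
    print(untested_components("12.9.1", "2.28.3"))
    print(untested_components("12.6", "2.27.5"))
```

An empty result means both components match the tested list. Any name returned marks a component to verify before relying on this release.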
Limitations
Same as 3.5.19
Known Issues
- The internal layout of RC-connected QPs changed starting in 3.5.19, which breaks ABI compatibility when IBGDA is enabled.