Repo: NVIDIA, published Dec 1, 2025

NVIDIA/nvalchemi-toolkit-ops


Description: ALCHEMI Toolkit-Ops is a collection of optimized batch kernels to accelerate computational chemistry and material science workflows.

Language: Python

License: NOASSERTION

Stars: 196

Forks: 28

Open issues: 11

Created: 2025-12-01T18:34:52Z

Pushed: 2026-06-10T21:00:56Z

Default branch: main

Fork: no

Archived: no

README:

NVIDIA ALCHEMI Toolkit-Ops

![PyPI version](https://badge.fury.io/py/nvalchemi-toolkit-ops) ![codecov](https://codecov.io/gh/NVIDIA/nvalchemi-toolkit-ops)

High-performance NVIDIA Warp primitives for computational chemistry

NVIDIA ALCHEMI Toolkit-Ops is a collection of GPU-optimized, batched primitives for accelerating atomistic simulations. The high-performance compute kernels are written in NVIDIA `warp-lang`.

Key Features

  • Molecular Dynamics kernels: Velocity Verlet (NVE), Langevin (NVT), Nosé-Hoover Chain (NVT), NPT/NPH ensembles, velocity rescaling
  • Geometry optimization: FIRE and FIRE2 with optional unit cell optimization
  • Neighbor lists: naive $O(N^2)$ and cell list $O(N)$ algorithms
  • Dispersion corrections via Becke-Johnson damped DFT-D3
  • Electrostatic interactions: Ewald, particle mesh Ewald (PME), and damped shifted force (DSF) algorithms
  • Differentiable physics: analytical stress tensor (virial) support for Ewald and PME, enabling stress-based MLIP training
  • NVIDIA Warp core with optional, JIT-compatible PyTorch and JAX bindings, including autograd support

The kernels are designed to scale to large systems (>100,000 atoms) and are generally optimized for high-throughput operation on GPUs (on the order of several microseconds per atom), with batching support.
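To illustrate the difference between the two neighbor list strategies named in the feature list, here is a minimal CPU sketch in plain Python. It is not the library's implementation: it ignores periodic boundaries and batching, and the function names `naive_pairs` and `cell_list_pairs` are hypothetical.

```python
import itertools
import math
import random
from collections import defaultdict


def naive_pairs(points, cutoff):
    """O(N^2): test every unordered pair against the cutoff."""
    pairs = set()
    for i, j in itertools.combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < cutoff:
            pairs.add((i, j))
    return pairs


def cell_list_pairs(points, cutoff):
    """O(N) at fixed density: bin points into cubic cells of edge `cutoff`,
    then compare each point only with points in the 27 surrounding cells."""
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(math.floor(c / cutoff) for c in p)
        cells[key].append(idx)

    pairs = set()
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating empty cells while iterating
            neighbors = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbors:
                    if i < j and math.dist(points[i], points[j]) < cutoff:
                        pairs.add((i, j))
    return pairs


# Both strategies must find the same pairs on a random toy system.
random.seed(0)
points = [tuple(random.uniform(0.0, 20.0) for _ in range(3)) for _ in range(300)]
assert naive_pairs(points, 3.0) == cell_list_pairs(points, 3.0)
```

Because two points closer than the cutoff can differ by at most one cell index along each axis, checking the 27 surrounding cells is sufficient, and the work per atom stays bounded as the system grows at fixed density.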

Use Cases

We currently see three primary use cases for nvalchemi-toolkit-ops in the ecosystem:

  • Library maintainers and developers are encouraged to benchmark the kernels and explore integrating functionality such as neighbor list computation to accelerate existing workflows;
  • Researchers and model developers should ideally be able to rely on this package (and not implement their own!) for neighbor list computation, interatomic interactions, and so on during method development;
  • Engineers building applications that involve molecular dynamics, interatomic potentials, and the like can take advantage of optimized and maintained low-level kernels. The `warp-lang` kernels should be sufficiently modular to allow for a high degree of flexibility and reusability.

Because the kernels in nvalchemi-toolkit-ops are GPU-first and batched, they should be ready for a wide range of research and production applications.

Example Snippets

We encourage interested readers to browse our hosted documentation. Below are some short snippets that highlight our straightforward API and use cases for PyTorch; see the hosted documentation for JAX details.

Neighbor list in a 2D unit cell with 50,000 atoms

This example uses PyTorch:

import torch
from nvalchemiops.torch.neighbors import neighbor_list

torch.set_default_dtype(torch.float32)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.set_default_device(device)

NUM_ATOMS = 50_000
# arbitrarily scale positions
positions = torch.randn((NUM_ATOMS, 3)) * 10.0
cell = torch.eye(3, dtype=torch.float32).unsqueeze(0)
pbc = torch.tensor([True, True, False], dtype=torch.bool)
cutoff = 6.0
# use padded matrix representation for neighbors, optimal for
# compiled applications that need constant shapes
neighbor_matrix, num_neighbors, shift_matrix = neighbor_list(
    positions,
    cutoff,
    cell=cell,
    pbc=pbc,
    method="cell_list"
)
# ...or pass `return_neighbor_list=True` for the familiar COO
# `edge_index` format. When `method` is omitted, the neighbor
# algorithm is chosen automatically based on system size
edge_index, neighbor_ptr, shifts = neighbor_list(
    positions,
    cutoff,
    cell=cell,
    pbc=pbc,
    return_neighbor_list=True
)
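The relationship between the padded matrix output and the COO output in the example above can be sketched in plain Python. This is an illustration under assumptions, not the library's conversion routine: the helper `padded_to_coo`, the fill value `PAD`, and the exact row/column orientation of `edge_index` are hypothetical.

```python
def padded_to_coo(neighbor_matrix, num_neighbors):
    """Convert a padded [num_atoms, max_neighbors] matrix into COO and
    CSR-style arrays.

    Only the first `num_neighbors[i]` entries of row i are read, so the
    padding value itself never matters.
    """
    sources, targets, ptr = [], [], [0]
    for i, (row, count) in enumerate(zip(neighbor_matrix, num_neighbors)):
        for j in row[:count]:
            sources.append(i)
            targets.append(j)
        ptr.append(len(sources))  # running edge count after atom i
    edge_index = [sources, targets]  # shape [2, num_edges]
    return edge_index, ptr


# Toy system of 3 atoms, padded to at most 2 neighbors per atom.
PAD = -1  # hypothetical fill value, for illustration only
neighbor_matrix = [[1, 2], [0, PAD], [0, PAD]]
num_neighbors = [2, 1, 1]

edge_index, ptr = padded_to_coo(neighbor_matrix, num_neighbors)
print(edge_index)  # [[0, 0, 1, 2], [1, 2, 0, 0]]
print(ptr)         # [0, 2, 3, 4]
```

The padded form keeps a constant shape regardless of how many neighbors each atom has, which is what makes it convenient for compiled code. The COO form stores only real edges, at the cost of a shape that changes with the system.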

DFT-D3(BJ) corrections on a batch of molecules

This example assumes you have already concatenated a set of molecules into combined tensors and computed their neighbors using the `neighbor_list` API. Here, we'll demonstrate using the matrix representation:

import torch
from nvalchemiops.torch.interactions.dispersion import dftd3
from nvalchemiops.torch.neighbors import neighbor_list

# the following parameters need to be constructed ahead of time
positions = ... # [num_atoms, 3]
atomic_numbers = ... # [num_atoms]
cell = ... # [num_systems, 3, 3]
pbc = ... # [num_systems, 3]
batch_idx = ... # [num_atoms]
batch_ptr = ... # [num_systems + 1]
# construct neighbor matrix
neighbor_matrix, num_neighbors, shift_matrix = neighbor_list(
    positions,
    cutoff=..., # on the order of ~20 Angstroms
    cell=cell,
    pbc=pbc,
    batch_idx=batch_idx,
    batch_ptr=batch_ptr
)
# DFT-D3 parameters, which include the reference C6 coefficients, must be provided.
# Refer to the user documentation to see the expected structure and data source.
d3_params = ...
# pass everything to the functional interface
d3_energies, d3_forces, coord_nums, d3_virials = dftd3(
    positions=positions,
    numbers=atomic_numbers,
    neighbor_matrix=neighbor_matrix,
    neighbor_matrix_shifts=shift_matrix,
    batch_idx=batch_idx,
    # functional specific DFT-D3 parameters (PBE shown)
    a1=0.4289, a2=4.4407, s8=0.7875,
    d3_params=d3_params,
    compute_virial=True
)
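To show where `a1`, `a2`, and `s8` enter, here is a minimal sketch of the standard two-body Becke-Johnson damped D3 energy for a single atom pair. It is not the library's kernel: the function name `d3bj_pair_energy` is hypothetical, the `c6` and `c8` values below are made-up illustration numbers rather than real reference coefficients, and the sketch assumes all inputs share one consistent unit system.

```python
import math


def d3bj_pair_energy(r, c6, c8, a1, a2, s8, s6=1.0):
    """Two-body DFT-D3(BJ) energy for one atom pair at distance r.

    c6 and c8 are the pair dispersion coefficients; all quantities are
    assumed to be in one consistent unit system.
    """
    r0 = math.sqrt(c8 / c6)  # pair-specific critical radius
    damp = a1 * r0 + a2      # Becke-Johnson damping length
    e6 = s6 * c6 / (r**6 + damp**6)
    e8 = s8 * c8 / (r**8 + damp**8)
    return -(e6 + e8)


# PBE damping parameters from the example above; c6/c8 are illustrative only.
params = dict(a1=0.4289, a2=4.4407, s8=0.7875)
for r in (0.0, 2.0, 10.0):
    print(r, d3bj_pair_energy(r, c6=10.0, c8=100.0, **params))
```

The damping length in the denominator keeps the energy finite as the distance goes to zero, which is the point of Becke-Johnson damping. At large distances the leading term reduces to the familiar attractive `-c6 / r**6` tail.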

Electrostatics via particle mesh Ewald

This example shows how to compute the per-atom and system energies as well as the forces using the particle mesh Ewald interface.

import torch
from nvalchemiops.torch.interactions.electrostatics import particle_mesh_ewald
from nvalchemiops.torch.neighbors import neighbor_list

# the following parameters need to be constructed ahead of time
positions = ... # [num_atoms, 3]
atomic_numbers = ... # [num_atoms]
cell = ... # [num_systems, 3, 3]
pbc = ... # [num_systems, 3]
atomic_charges = ... # [num_atoms]
# construct neighbor matrix
neighbor_matrix, num_neighbors, shift_matrix = neighbor_list(
    positions,
    cutoff=..., # on the order of ~20 Angstroms
    cell=cell,
    pbc=pbc,
)
# call PME, using automatic parameter tuning
atom_energies, atom_forces = particle_mesh_ewald(
    positions=positions,
    charges=atomic_charges,
    cell=cell,…
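As background for the interface above: Ewald-type methods, PME included, rest on splitting the Coulomb kernel 1/r into a rapidly decaying short-range part and a smooth long-range part. A minimal sketch of that identity in plain Python follows. The name `coulomb_split` and the value of `alpha` are hypothetical, chosen only for illustration, and are not part of the nvalchemiops API.

```python
import math


def coulomb_split(r, alpha):
    """Split 1/r into a short-range and a long-range part."""
    short_range = math.erfc(alpha * r) / r  # decays fast with distance
    long_range = math.erf(alpha * r) / r    # smooth everywhere
    return short_range, long_range


alpha = 0.3  # hypothetical splitting parameter, for illustration only
for r in (1.0, 5.0, 10.0):
    short, long_ = coulomb_split(r, alpha)
    # The two parts always add back up to the bare Coulomb kernel.
    assert math.isclose(short + long_, 1.0 / r)
    print(f"r={r:5.1f}  short={short:.3e}  long={long_:.3e}")
```

The short-range part is what a cutoff-based neighbor list can cover, while the smooth part is what PME evaluates on a mesh using FFTs.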

