Release · NVIDIA · published Mar 6, 2026

NVIDIA/warp v1.12.0


v1.12.0

Repository: NVIDIA/warp

Tag: v1.12.0

Published: 2026-03-06T20:21:37Z

Prerelease: no

Release notes:

Warp v1.12.0

Warp v1.12 adds experimental hardware-accelerated texture sampling on CUDA GPUs, extends tile programming with element-wise arithmetic operators and differentiable FFT, and broadens JAX interoperability with jax.vmap support. This release also introduces subscript-style type hints for better IDE integration, new quaternion and approximate-math builtins, B-spline shape functions in warp.fem, and a collection of utility and diagnostics APIs.

New features

Hardware-accelerated textures

> Experimental. This API may change without a formal deprecation cycle.

Warp v1.12 introduces wp.Texture1D, wp.Texture2D, and wp.Texture3D classes that leverage CUDA texture memory for hardware-accelerated interpolation directly inside Warp kernels. On GPU, texture reads are routed through dedicated texture units that perform filtered lookups in a single instruction, making them ideal for rendering, volume sampling, signed-distance-field queries, and simulation lookup tables. On CPU, a software fallback provides identical semantics so the same kernel code runs on both devices.

import warp as wp
import numpy as np

wp.init()

# 64x64 single-channel height map
data = np.random.rand(64, 64).astype(np.float32)

# Create a 2D texture with bilinear filtering
tex = wp.Texture2D(data, filter_mode=wp.Texture.FILTER_LINEAR)

@wp.kernel
def sample_texture(tex: wp.Texture2D, coords: wp.array[wp.vec2f], out: wp.array[float]):
    i = wp.tid()
    # Coordinates are in [0, 1]; bilinear interpolation is automatic
    out[i] = wp.texture_sample(tex, coords[i], dtype=float)

coords = wp.array(np.random.rand(1024, 2).astype(np.float32), dtype=wp.vec2f)
result = wp.zeros(1024, dtype=float)
wp.launch(sample_texture, dim=1024, inputs=[tex, coords, result])

print(f"Sampled {result.shape[0]} points, range: [{result.numpy().min():.4f}, {result.numpy().max():.4f}]")
# Example output: Sampled 1024 points, range: [0.0069, 0.9793]

Key capabilities:

  • 1D / 2D / 3D texture classes (wp.Texture1D, wp.Texture2D, wp.Texture3D) with matching wp.texture_sample() overloads that accept scalar, vec2f, or vec3f coordinates.
  • Filter modes: FILTER_POINT for nearest-neighbor sampling and FILTER_LINEAR for bilinear (2D) or trilinear (3D) interpolation.
  • Address modes: ADDRESS_WRAP, ADDRESS_CLAMP, ADDRESS_MIRROR, and ADDRESS_BORDER control how out-of-range texture coordinates are handled, configurable per axis.
  • Array interop: Texture objects provide copy_from_array() and copy_to_array() methods to transfer data between wp.array objects and texture memory. A cuda_surface property exposes the CUDA surface handle for advanced interop.
  • Broad dtype support: Textures accept integer and floating-point data types with 1, 2, or 4 channels. Integer types are automatically normalized to floating-point values on read.
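
To make the filter and address modes above concrete, here is a minimal pure-Python sketch of what a bilinear lookup with normalized coordinates computes. It is not Warp's implementation: the function names `resolve_index` and `sample_bilinear` are hypothetical, and the texel-center convention (`x = u * width - 0.5`, with `u` indexing columns) is an assumption borrowed from CUDA's documented linear filtering rather than something the release notes state.

```python
import math


def resolve_index(i, n, mode):
    """Map an integer texel index onto [0, n), or return None for 'border'.

    The mode names mirror ADDRESS_CLAMP / WRAP / MIRROR / BORDER, but this
    helper is a hypothetical illustration, not part of Warp.
    """
    if mode == "clamp":
        return min(max(i, 0), n - 1)
    if mode == "wrap":
        return i % n
    if mode == "mirror":
        period = 2 * n
        m = i % period
        return m if m < n else period - 1 - m
    if mode == "border":
        return i if 0 <= i < n else None
    raise ValueError(f"unknown address mode: {mode}")


def sample_bilinear(texels, u, v, mode="clamp", border=0.0):
    """Bilinear lookup on a row-major 2D list with (u, v) in [0, 1]."""
    height, width = len(texels), len(texels[0])

    # Assumed convention: texel centers sit at half-integer positions.
    x = u * width - 0.5
    y = v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def fetch(ix, iy):
        cx = resolve_index(ix, width, mode)
        cy = resolve_index(iy, height, mode)
        if cx is None or cy is None:
            return border  # out-of-range reads return the border value
        return texels[cy][cx]

    # Blend along x on the two neighboring rows, then blend along y.
    top = (1.0 - fx) * fetch(x0, y0) + fx * fetch(x0 + 1, y0)
    bottom = (1.0 - fx) * fetch(x0, y0 + 1) + fx * fetch(x0 + 1, y0 + 1)
    return (1.0 - fy) * top + fy * bottom


height_map = [[0.0, 1.0], [2.0, 3.0]]
center = sample_bilinear(height_map, 0.5, 0.5)
print(center)  # 1.5, the average of all four texels
```

Under these assumptions, sampling exactly at a texel center returns that texel unchanged, and the address mode only matters for the neighbors that fall outside the grid.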

Subscript-style type hints

When annotating kernel parameters with call-syntax forms like wp.array(dtype=float), static type checkers such as Pyright and Pylance flag these as errors because the expressions look like constructor calls rather than type annotations. Warp v1.12 adds subscript-style alternatives that are recognized as valid generic aliases (#1216):

# Before (flagged as error by Pyright/Pylance):
@wp.kernel
def my_kernel(a: wp.array(dtype=float), b: wp.array2d(dtype=wp.vec3)):
    ...

# After (clean subscript syntax):
@wp.kernel
def my_kernel(a: wp.array[float], b: wp.array2d[wp.vec3]):
    ...

The subscript syntax is supported for all array dimensionalities (wp.array[dtype] through wp.array4d[dtype]) as well as wp.tile[dtype] for tile-typed arguments.
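
As a rough illustration of why the subscript form satisfies type checkers, the sketch below uses only the standard `typing` module. The class `array` here is a hypothetical stand-in, not Warp's class, and nothing is implied about how Warp implements its own aliases; the point is only that `X[dtype]` is ordinary generic-alias syntax that tools can introspect.

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar("T")


class array(Generic[T]):
    """Hypothetical stand-in for an array annotation type (not wp.array)."""


def describe(annotation):
    """Return the generic origin and type arguments of a subscripted alias."""
    return get_origin(annotation), get_args(annotation)


def kernel_like(a: array[float], b: array[int]):
    ...


hints = kernel_like.__annotations__
origin, args = describe(hints["a"])
print(origin is array, args)
```

A call expression such as `array(dtype=float)` would instead construct an instance at annotation time, which is why checkers reject it as a type.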

Warp's static type checking compatibility is being improved incrementally, and you may encounter other Pyright/Pylance diagnostics that are not yet resolved. If you run into type checking issues, please report them as sub-issues of #549.

Diagnostics utility

The new wp.print_diagnostics() function displays a comprehensive snapshot of the Warp build and runtime environment (software versions, CUDA information, build flags, and available devices) in a single call (#1221). Two companion helpers, wp.get_cuda_toolkit_version() and wp.get_cuda_driver_version(), return the CUDA toolkit and driver versions as integer tuples (#1172). Together these are useful for debugging environment issues, capturing context in CI logs, and providing system information when filing bug reports.
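
One way these version tuples might be used in a CI guard is sketched below. The helpers `format_version` and `meets_minimum` are hypothetical, and the example assumes the tuples have a `(major, minor)` shape; the release notes say only that they are integer tuples. A literal value stands in for the return value of `wp.get_cuda_driver_version()` so the snippet runs without a GPU.

```python
def format_version(version):
    """Render an integer tuple such as (12, 4) as the string '12.4'."""
    return ".".join(str(part) for part in version)


def meets_minimum(version, minimum):
    """Compare version tuples lexicographically, e.g. (12, 4) >= (12, 0)."""
    return tuple(version) >= tuple(minimum)


# Hypothetical value; in a real script this would come from
# wp.get_cuda_driver_version().
driver_version = (12, 4)
required = (12, 0)

if meets_minimum(driver_version, required):
    print(f"CUDA driver {format_version(driver_version)} is new enough")
else:
    print(f"CUDA driver {format_version(driver_version)} is older than "
          f"{format_version(required)}")
```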

Quaternion and spatial helpers

Warp v1.12 adds quaternion and spatial transformation helpers: wp.quat_from_euler(), wp.quat_to_euler(), wp.transform_twist(), and wp.transform_wrench() (#1237). The Euler conversion functions accept axis indices (0 = X, 1 = Y, 2 = Z) so you can specify arbitrary rotation-order conventions such as ZYX or XYZ, making them suitable for robotics and animation pipelines:

euler = wp.vec3(0.0, wp.PI / 4.0, 0.0)
q = wp.quat_from_euler(euler, 2, 1, 0) # ZYX convention
print(q) # [0.0, 0.3826834559440613, 0.0, 0.9238795042037964]
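
The printed value can be cross-checked without Warp. The sketch below builds one axis-angle quaternion per Euler component in Warp's `(x, y, z, w)` layout and multiplies them together. The helper names are hypothetical, and the composition order is an assumption, not taken from Warp's source; with a single non-zero angle, as in the example above, the order does not affect the result.

```python
import math


def axis_angle_quat(axis, angle):
    """Quaternion (x, y, z, w) for a rotation of `angle` about a basis axis."""
    half = 0.5 * angle
    q = [0.0, 0.0, 0.0, math.cos(half)]
    q[axis] = math.sin(half)
    return tuple(q)


def quat_mul(a, b):
    """Hamilton product of two quaternions in (x, y, z, w) layout."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )


def quat_from_euler_sketch(e, i, j, k):
    """Compose rotations about axes i, j, k (assumed order, for illustration)."""
    qi = axis_angle_quat(i, e[0])
    qj = axis_angle_quat(j, e[1])
    qk = axis_angle_quat(k, e[2])
    return quat_mul(quat_mul(qi, qj), qk)


q_check = quat_from_euler_sketch((0.0, math.pi / 4.0, 0.0), 2, 1, 0)
print(q_check)  # approximately (0, 0.38268, 0, 0.92388)
```

The y and w components are sin(π/8) and cos(π/8), matching a 45-degree rotation about the Y axis.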

Approximate math intrinsics

wp.div_approx() and wp.inverse_approx() expose GPU hardware fast-math instructions (div.approx.f32 and rcp.approx.ftz.f64) for approximate floating-point division and reciprocal, offering higher throughput at reduced precision (#1199). Only floating-point types are supported. On CPU, both functions fall back to exact arithmetic so the same kernel code runs correctly on either device.

Marching cubes lookup tables

The internal marching cubes lookup tables are now exposed as public class attributes on wp.MarchingCubes: CUBE_CORNER_OFFSETS, EDGE_TO_CORNERS, CASE_TO_TRI_RANGE, and TRI_LOCAL_INDICES (#1151). These tables enable custom marching cubes implementations for advanced use cases such as sparse volume extraction or procedural mesh generation without having to duplicate the standard lookup data.

Graph coloring API

wp.utils.graph_coloring_assign(), wp.utils.graph_coloring_balance(), and wp.utils.graph_coloring_get_groups() are now part of the public API (#1145). These graph coloring utilities were originally introduced in warp.sim in v1.5.0 for use with VBDIntegrator and were removed along with the warp.sim module in v1.10.0. They are now re-introduced as standalone functions in wp.utils, independent of any physics module.…
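
For readers unfamiliar with the technique, the sketch below shows the general idea behind graph coloring for parallel solvers: nodes that share an edge receive different colors, so all nodes of one color can be updated concurrently. It uses a plain greedy algorithm in pure Python with hypothetical function names; it does not reproduce Warp's algorithm, signatures, or balancing step.

```python
def greedy_coloring(num_nodes, edges):
    """Assign each node the smallest color unused by its neighbors."""
    neighbors = [set() for _ in range(num_nodes)]
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    colors = [-1] * num_nodes  # -1 marks a node that is not yet colored
    for node in range(num_nodes):
        used = {colors[n] for n in neighbors[node] if colors[n] >= 0}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


def groups_by_color(colors):
    """Collect node indices into one list per color, ordered by color."""
    groups = {}
    for node, color in enumerate(colors):
        groups.setdefault(color, []).append(node)
    return [groups[color] for color in sorted(groups)]


# A chain of four nodes: 0 - 1 - 2 - 3
chain_edges = [(0, 1), (1, 2), (2, 3)]
chain_colors = greedy_coloring(4, chain_edges)
chain_groups = groups_by_color(chain_colors)
print(chain_colors)  # [0, 1, 0, 1]
print(chain_groups)  # [[0, 2], [1, 3]]
```

Each group contains no adjacent nodes, so a solver could process group 0 in one parallel pass and group 1 in the next.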
