NVIDIA/MatX
Description: An efficient C++20 GPU numerical computing library with Python-like syntax
Language: C++
License: BSD-3-Clause
Stars: 1430
Forks: 117
Open issues: 20
Created: 2021-09-14T22:49:10Z
Pushed: 2026-06-09T23:37:19Z
Default branch: main
Fork: no
Archived: no
README:
MatX - GPU-Accelerated Numerical Computing in Modern C++

MatX is a modern C++ library for numerical computing on NVIDIA GPUs and CPUs. Near-native performance can be achieved while using a simple syntax common in higher-level languages such as Python or MATLAB.

The above image shows the Python (NumPy) version of an FFT resampler next to the MatX version. The total runtimes of the NumPy version, CuPy version, and MatX version are shown below:
- Python/NumPy: 5360ms (Xeon(R) CPU E5-2698 v4 @ 2.20GHz)
- CuPy: 10.6ms (A100)
- MatX: 2.54ms (A100)
While the code complexity and length are roughly the same, the MatX version is roughly 2100x faster than the NumPy version and over 4x faster than the CuPy version on the same GPU.
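The speedup figures follow directly from the runtimes listed above. As a minimal sketch of that arithmetic (the `speedup` helper and the constant names are hypothetical and not part of MatX), each ratio is the slower runtime divided by the MatX runtime:

```cpp
#include <cassert>

// Runtimes in milliseconds, copied from the list above.
const double kNumpyMs = 5360.0;  // Python/NumPy on the Xeon CPU
const double kCupyMs  = 10.6;    // CuPy on the A100
const double kMatxMs  = 2.54;    // MatX on the A100

// Hypothetical helper (not a MatX function): how many times faster
// `faster_ms` is than `baseline_ms`.
double speedup(double baseline_ms, double faster_ms) {
    assert(faster_ms > 0.0);  // guard against division by zero
    return baseline_ms / faster_ms;
}
```

Here `speedup(kNumpyMs, kMatxMs)` comes to about 2110 and `speedup(kCupyMs, kMatxMs)` to about 4.17, which match the "2100x" and "over 4x" figures quoted.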
Key features include:
- :zap: MatX is fast. By using existing, optimized libraries as a backend, and efficient kernel generation when needed, no hand-optimizations are necessary
- :open_hands: MatX is easy to learn. Users familiar with high-level languages will pick up the syntax quickly
- :bookmark_tabs: MatX easily integrates with existing libraries and code
- :sparkler: Visualize data from the GPU right in a web browser
- :arrow_up_down: IO capabilities for reading/writing files
Table of Contents
- [Requirements](#requirements)
- [Installation](#installation)
- [Building MatX](#building-matx)
- [Integrating MatX With Your Own Projects](#integrating-matx-with-your-own-projects)
- [Documentation](#documentation)
- [Supported Data Types](#supported-data-types)
- [Unit Tests](#unit-tests)
- [Quick Start Guide](#quick-start-guide)
- [Release History](#release-history)
- [Filing Issues](#filing-issues)
- [Contributing Guide](#contributing-guide)
Requirements
MatX currently supports only Linux because of the time required to test on Windows. If you'd like to voice your support for native Windows support using Visual Studio, please comment on the issue here: https://github.com/NVIDIA/MatX/issues/153.
MatX uses features of C++20 and the latest CUDA compilers and libraries. For this reason, when running with GPU support, CUDA 12.2.1 and g++9, nvc++ 24.5, or clang 17 or newer are required. The CUDA Toolkit is available for download from NVIDIA.
MatX has been tested on and supports Volta, Ampere, Ada, Hopper, and Blackwell GPU architectures. Jetson products are supported with Jetpack 5.0 or above.
The MatX build system when used with CMake will automatically fetch packages from the internet that are missing or out of date. If you are on a machine without internet access or want to manage the packages yourself, please follow the offline instructions and pay attention to the required versions of the dependencies.
Note for CPU/Host support: CPU/Host execution support is nearly on par with GPU support. Currently all elementwise operators, reductions, and FFT/BLAS/LAPACK transforms are supported. Most host functions, with the exception of reductions, support multithreading. If you find a bug in an operator on CPU, please report it as an issue. More detail can be found in the documentation.
Installation
MatX is a header-only library that does not need to be compiled to be used in your applications. However, unit tests, benchmarks, and examples must be compiled. CPM is used as a package manager for CMake to download and configure any dependencies. If MatX is to be used in an air-gapped environment, CPM can be configured to search locally for files. Depending on which options are enabled, compiling can take a very long time without parallelism. Passing the `-j` flag to `make` with the highest job count your system will accommodate is suggested.
Building MatX
To build all components, issue the standard cmake build commands in a cloned repo:
mkdir build && cd build
cmake -DMATX_BUILD_TESTS=ON -DMATX_BUILD_BENCHMARKS=ON -DMATX_BUILD_EXAMPLES=ON -DMATX_BUILD_DOCS=OFF ..
make -j
By default CMake will target the GPU architecture(s) of the system you're compiling on. If you wish to target other architectures, pass the CMAKE_CUDA_ARCHITECTURES flag with a list of architectures to build for:
cmake .. -DCMAKE_CUDA_ARCHITECTURES="80;90"
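The values in that list are CUDA compute capabilities written without the decimal point, so `80;90` targets an Ampere A100-class GPU (8.0) and a Hopper GPU (9.0). The sketch below shows one way to build such a list from the GPU family names mentioned under Requirements. The `arch_table` and `cuda_architectures` helpers are hypothetical and not part of MatX or CMake, and picking a single capability per family is an assumption of this example, since several families span more than one value:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One representative compute capability per GPU family (assumption:
// families such as Ampere and Blackwell also have other capabilities).
const std::map<std::string, std::string>& arch_table() {
    static const std::map<std::string, std::string> table = {
        {"Volta", "70"},  {"Ampere", "80"},     {"Ada", "89"},
        {"Hopper", "90"}, {"Blackwell", "100"},
    };
    return table;
}

// Hypothetical helper: joins the capabilities of the requested families
// with ';', the separator CMAKE_CUDA_ARCHITECTURES expects.
// Family names missing from the table are skipped.
std::string cuda_architectures(const std::vector<std::string>& families) {
    std::string out;
    for (const auto& name : families) {
        const auto it = arch_table().find(name);
        if (it == arch_table().end()) continue;  // unknown family
        if (!out.empty()) out += ';';
        out += it->second;
    }
    return out;
}
```

Calling `cuda_architectures({"Ampere", "Hopper"})` returns `80;90`, the same value passed in the command above.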
By default nothing is compiled. If you wish to compile certain options, use the CMake flags below with ON or OFF values:
MATX_BUILD_TESTS
MATX_BUILD_BENCHMARKS
MATX_BUILD_EXAMPLES
MATX_BUILD_DOCS
For example, to enable unit test building:
mkdir build && cd build
cmake -DMATX_BUILD_TESTS=ON ..
make -j
Integrating MatX With Your Own Projects
MatX uses CMake as a first-class build generator, and therefore provides the proper config files to include into your own project. There are typically two ways to do this:
1. Adding MatX as a subdirectory
2. Installing MatX to the system
1. MatX as a Subdirectory
Adding the subdirectory is useful if you include the MatX source into the directory structure of your project. Using this method, you can simply add the MatX directory:
add_subdirectory(path/to/matx)
Subproject builds do not generate MatX package config files or install rules by default. If your parent project needs to install MatX for redistribution, enable that behavior before adding the subdirectory:
set(MATX_GENERATE_CONFIG ON)
add_subdirectory(path/to/matx)
An example of using this method can be found in the [examples/cmake_sample_project](examples/cmake_sample_project) directory.
2. MatX Installed to the System
The other option is to install MatX and use the configuration file provided after building. This is typically done in a way similar to what is shown below:
cd /path/to/matx
mkdir build && cd build
cmake ..…