Repo: NVIDIA, published Mar 31, 2026

NVIDIA/TensorRT-RTX-EP-ABI

C++

Description: The NVIDIA TensorRT RTX Execution Provider (EP) is an inference deployment solution designed specifically for NVIDIA RTX GPUs, optimized for client-centric use cases.

Language: C++

License: Apache-2.0

Stars: 21

Forks: 1

Open issues: 1

Created: 2026-03-31T09:25:29Z

Pushed: 2026-06-09T06:00:18Z

Default branch: main

Fork: no

Archived: no

README:

NVIDIA TensorRT RTX Execution Provider

The NVIDIA TensorRT RTX Execution Provider (EP) is an inference deployment solution designed specifically for NVIDIA RTX GPUs, optimized for client-centric use cases.

This EP is built as a standalone plugin (onnxruntime_providers_nv_tensorrt_rtx.dll on Windows, libonnxruntime_providers_nv_tensorrt_rtx.so on Linux) that implements the ORT EP ABI interfaces (OrtEpFactory, OrtEp, OrtNodeComputeInfo, OrtDataTransferImpl, etc.) introduced in ORT 1.23.0. It does not need to be built together with ONNX Runtime.

The TensorRT RTX EP leverages NVIDIA's TensorRT for RTX engine to accelerate ONNX models on RTX GPUs. It supports RTX GPUs based on Ampere and later architectures (NVIDIA GeForce RTX 30xx and above).

Benefits:

  • Small package footprint — optimized resource usage on end-user systems at just under 200 MB.
  • Faster model compile and load times — leverages just-in-time compilation to build RTX hardware-optimized engines on end-user devices in seconds.
  • Portability — seamlessly use cached models across multiple RTX GPUs.

Contents

  • [Compatibility Matrix](#compatibility-matrix)
  • [Build from Source](#build-from-source)
  • [Prerequisites](#prerequisites)
  • [Quick Build](#quick-build)
  • [Python Wheel](#python-wheel)
  • [Usage](#usage)
  • [C/C++](#cc)
  • [Python](#python)
  • [Documentation](#documentation)
  • [Examples](#examples)
  • [Contributing](#contributing)
  • [License](#license)

Compatibility Matrix

| EP Version | ORT Version | TRT RTX Version | Notes |
|------------|-------------|-----------------|-------|
| 0.1 | 1.24.0+ | 1.4.x.x | Initial Windows Support |
| 0.3 | 1.25.0+ | 1.5.x.x | Linux Support, Weight Streaming, CIG Interop, ORT Version Negotiation |
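
The matrix above can be turned into a runtime guard. The following is a minimal sketch and not part of the repository: `Version`, `ParseVersion`, `MinOrtVersionForEp`, and `IsOrtCompatible` are hypothetical names, and the minimum versions are copied from the two rows listed here. It compares only the major and minor components of an ORT version string such as "1.25.0".

```cpp
#include <cstdio>
#include <string>

// Major/minor pair taken from a version string such as "1.25.0".
struct Version {
    int major_part;
    int minor_part;
};

// Parses the leading "major.minor" of `text`; returns false if it is malformed.
inline bool ParseVersion(const std::string& text, Version& out) {
    return std::sscanf(text.c_str(), "%d.%d", &out.major_part, &out.minor_part) == 2;
}

// Hypothetical lookup: minimum ORT version per EP release, copied from the matrix.
// Returns false for EP releases that the matrix does not list.
inline bool MinOrtVersionForEp(const std::string& ep_version, Version& out) {
    if (ep_version == "0.1") { out = Version{1, 24}; return true; }
    if (ep_version == "0.3") { out = Version{1, 25}; return true; }
    return false;
}

// True when `ort_version` meets the minimum listed for `ep_version`.
inline bool IsOrtCompatible(const std::string& ep_version,
                            const std::string& ort_version) {
    Version required{};
    Version actual{};
    if (!MinOrtVersionForEp(ep_version, required) ||
        !ParseVersion(ort_version, actual)) {
        return false;
    }
    if (actual.major_part != required.major_part) {
        return actual.major_part > required.major_part;
    }
    return actual.minor_part >= required.minor_part;
}
```

With these definitions, `IsOrtCompatible("0.3", "1.24.0")` evaluates to false and `IsOrtCompatible("0.3", "1.25.1")` to true. An application would typically pass in the string returned by `OrtGetApiBase()->GetVersionString()`; the sketch leaves that call out so it compiles without the ORT headers.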

Build from Source

Prerequisites

| Dependency | Minimum Version | Platform | Notes |
|------------|-----------------|----------|-------|
| CMake | 3.15 | All | |
| Visual Studio | 2019 or 2022 (Desktop C++ workload) | Windows | |
| GCC / Clang | C++20-capable | Linux | |
| CUDA Toolkit | 12.9+ | All | |
| ONNX Runtime SDK | 1.24.0+ | All | |
| TensorRT RTX SDK | 1.1.1+ | All | |

Quick Build

Configure and build using standard CMake commands. Three CMake cache variables control where the dependencies are found:

| CMake Variable | Description |
|----------------|-------------|
| CUDAToolkit_ROOT | Path to the CUDA Toolkit installation (optional; taken from the environment if not set) |
| ONNXRUNTIME_ROOT | Path to the ONNX Runtime SDK (contains include/ and lib/) |
| TRT_RTX_ROOT | Path to the TensorRT RTX SDK (contains include/ and lib/) |

Windows

cmake -B build -G "Visual Studio 17 2022" -A x64 `
  -DONNXRUNTIME_ROOT="C:\SDK\onnxruntime-win-x64-1.24.0" `
  -DTRT_RTX_ROOT="C:\SDK\TensorRT-RTX-1.1.1.36"
cmake --build build --config Release

Note: A protobuf installation already present on the system (for example, one installed through winget) conflicts with the CMake configuration and causes it to fail.

Windows with vcpkg Package Manager

vcpkg can optionally be used to manage dependencies (protobuf, ONNX, abseil) instead of CMake FetchContent.

cmake -B build -G "Visual Studio 17 2022" -A x64 `
  -DONNXRUNTIME_ROOT="C:\SDK\onnxruntime-win-x64-1.24.0" `
  -DTRT_RTX_ROOT="C:\SDK\TensorRT-RTX-1.1.1.36" `
  -DUSE_VCPKG=ON `
  -DCMAKE_TOOLCHAIN_FILE="..\vcpkg\scripts\buildsystems\vcpkg.cmake" `
  -DVCPKG_TARGET_TRIPLET=x64-windows-static-md `
  -DVCPKG_HOST_TRIPLET=x64-windows
cmake --build build --config Release

Linux

cmake -B build \
  -DONNXRUNTIME_ROOT=/path/to/onnxruntime \
  -DTRT_RTX_ROOT=/path/to/tensorrt-rtx
cmake --build build

Linux with vcpkg Package Manager

cmake -B build \
  -DONNXRUNTIME_ROOT=/path/to/onnxruntime \
  -DTRT_RTX_ROOT=/path/to/tensorrt-rtx \
  -DUSE_VCPKG=ON \
  -DCMAKE_TOOLCHAIN_FILE=../vcpkg/scripts/buildsystems/vcpkg.cmake \
  -DVCPKG_TARGET_TRIPLET=x64-linux \
  -DVCPKG_HOST_TRIPLET=x64-linux
cmake --build build

The output library is at:

  • Windows: build\Release\onnxruntime_providers_nv_tensorrt_rtx.dll
  • Linux: build/libonnxruntime_providers_nv_tensorrt_rtx.so
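
Because the output file name differs by platform, an application that registers the plugin may want to choose the name at compile time. The sketch below is one way to do that; `TensorRtRtxEpLibraryName` is a hypothetical helper, and only the two file names come from the list above.

```cpp
#include <string>

// Returns the EP plugin file name for the platform this code is compiled for.
// The two names are the build outputs listed above; the helper is hypothetical.
inline std::string TensorRtRtxEpLibraryName() {
#if defined(_WIN32)
    return "onnxruntime_providers_nv_tensorrt_rtx.dll";
#else
    return "libonnxruntime_providers_nv_tensorrt_rtx.so";
#endif
}
```

The helper returns a narrow string. ORT path parameters use ORTCHAR_T, which is a wide character type on Windows, so the result would need converting there, or the literal can be wrapped in ORT_TSTR as the usage example in this README does.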

Building with Unit Tests

Unit tests are built by default (BUILD_TESTS=ON). D3D12 graphics interop is compiled in automatically on Windows.

cmake -B build -G "Visual Studio 17 2022" -A x64 `
  -DONNXRUNTIME_ROOT="C:\SDK\onnxruntime-win-x64-1.25.0" `
  -DTRT_RTX_ROOT="C:\SDK\TensorRT-RTX-1.1.1.36"
cmake --build build --config Release

Run the tests:

build\tests\Release\unittests.exe

Note: CIG interop test cases require ORT SDK 1.25+ (ORT_API_VERSION >= 25). With ORT 1.24, only the EP registration smoke test compiles.
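
The version gate described in the note can be pictured as a preprocessor guard on ORT_API_VERSION. This is an illustrative sketch, not the repository's test code: `kCigInteropTestsEnabled` and `DescribeTestCoverage` are hypothetical names, and the fallback `#define` exists only so the snippet compiles without the ORT headers.

```cpp
// Stand-in so this sketch compiles on its own (assumption for illustration).
// In a real build, include <onnxruntime_c_api.h>, which defines ORT_API_VERSION.
#ifndef ORT_API_VERSION
#define ORT_API_VERSION 24
#endif

// Code that depends on ORT 1.25+ is compiled only when the SDK is new enough.
#if ORT_API_VERSION >= 25
constexpr bool kCigInteropTestsEnabled = true;
#else
constexpr bool kCigInteropTestsEnabled = false;
#endif

// Hypothetical helper that reports which test groups this build contains.
inline const char* DescribeTestCoverage() {
    return kCigInteropTestsEnabled
               ? "EP registration smoke test and CIG interop tests"
               : "EP registration smoke test only";
}
```

Guarding at compile time keeps a single test source tree buildable against both ORT 1.24 and 1.25 SDKs.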

See [doc/BUILD_GUIDE.md](doc/BUILD_GUIDE.md) for the full build guide with troubleshooting and integration instructions.

Usage

The TensorRT RTX EP uses the V2 device-based EP API introduced in ORT 1.23.0. The EP library is registered dynamically at runtime, then devices are enumerated and appended to the session.

C/C++

#include <onnxruntime_cxx_api.h>

#include <cstring>
#include <stdexcept>
#include <vector>

Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "MyApp");
Ort::SessionOptions session_options;

// 1. Register the EP plugin library
env.RegisterExecutionProviderLibrary(
    "NvTensorRTRTXExecutionProvider",
    ORT_TSTR("onnxruntime_providers_nv_tensorrt_rtx.dll"));

// 2. Enumerate available EP devices and find TensorRT RTX
Ort::ConstEpDevice trt_device = {};
for (auto& ep_device : env.GetEpDevices()) {
    if (std::strcmp(ep_device.EpName(), "NvTensorRTRTXExecutionProvider") == 0) {
        trt_device = ep_device;
        break;
    }
}
if (!trt_device) {
    throw std::runtime_error("TensorRT RTX EP device not found");
}

// 3. Append the EP with provider options
Ort::KeyValuePairs ep_options;
ep_options.Add("enable_cuda_graph", "1");
std::vector<Ort::ConstEpDevice> devices = {trt_device};
session_options.AppendExecutionProvider_V2(env, devices, ep_options);

// 4. Create session
Ort::Session session(env, ORT_TSTR("model.onnx"), session_options);

Python…

Excerpt shown — open the source for the full document.
