Repo · NVIDIA · published Mar 31, 2026 · seen 5d

NVIDIA/NeMo-Relay

Rust

Description: Multi-language agent runtime for execution scope management, lifecycle events, and middleware on tool and LLM calls.

Language: Rust

License: Apache-2.0

Stars: 56

Forks: 26

Open issues: 5

Created: 2026-03-31T23:22:24Z

Pushed: 2026-06-11T01:19:13Z

Default branch: main

Fork: no

Archived: no

README:

[Codecov](https://app.codecov.io/gh/NVIDIA/NeMo-Relay) [Ask DeepWiki](https://deepwiki.com/NVIDIA/NeMo-Relay)

NVIDIA NeMo Relay

What Is NeMo Relay?

NVIDIA NeMo Relay is a portable execution runtime for agent systems that already have a framework, model provider, policy layer, or observability backend. It gives those systems one consistent way to describe, control, and observe what happens when an agent crosses a request, tool, or LLM boundary.

Agent applications rarely live inside one clean abstraction. A production stack might combine NeMo Agent Toolkit, LangChain, LangGraph, provider SDKs, custom harness code, NeMo Guardrails, tracing systems, and evaluation pipelines. NeMo Relay sits underneath those choices as the shared runtime contract for scopes, middleware, plugins, lifecycle events, adaptive behavior, and observability.

Built as a Rust core with primary Rust, Python, and Node.js bindings, NeMo Relay lets applications keep their orchestration model while runtime behavior stays consistent across frameworks and languages.

Why Use It?

  • 🧭 Own execution context across the whole agent run: Hierarchical scopes attach tools, LLM calls, middleware, subscribers, and events to the same parent-child execution tree.

  • 🛡️ Package policy once: Guardrails and intercepts can block work, sanitize observability payloads, transform requests, or wrap execution without rewriting every call site.

  • 📡 Emit one lifecycle stream: Subscribers consume canonical runtime events in-process or export them as ATIF v1.7 trajectories, OpenTelemetry traces, or OpenInference-compatible traces.

  • 🧩 Integrate without a framework migration: NeMo Relay can sit below NeMo ecosystem components, third-party agent frameworks, provider adapters, or direct application code.

  • ⚙️ Install reusable runtime behavior: Plugins configure middleware, subscribers, adaptive components, observability exporters, and custom runtime behavior from one shared system.
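To make the "package policy once" idea concrete, here is a minimal, framework-free Python sketch of an intercept chain that can block a call or transform its arguments before the tool runs. It does not use the NeMo Relay API: every name in it (`ToolCall`, `Blocked`, `block_shell`, `redact_secrets`, `run_with_intercepts`, `echo_tool`) is hypothetical and exists only to illustrate the pattern.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical description of one tool invocation."""
    name: str
    args: dict


class Blocked(Exception):
    """Raised by a guardrail to stop a call before it executes."""


def block_shell(call, next_):
    # Guardrail: refuse a disallowed tool outright.
    if call.name == "shell":
        raise Blocked(f"tool {call.name!r} is not allowed")
    return next_(call)


def redact_secrets(call, next_):
    # Transform: mask a sensitive argument before passing the call on.
    cleaned = {k: ("***" if k == "api_key" else v) for k, v in call.args.items()}
    return next_(replace(call, args=cleaned))


def _wrap(intercept, next_):
    # Bind one intercept to the handler that should run after it.
    return lambda call: intercept(call, next_)


def run_with_intercepts(call, intercepts, execute):
    # Build the chain back to front so the first intercept runs outermost.
    handler = execute
    for intercept in reversed(intercepts):
        handler = _wrap(intercept, handler)
    return handler(call)


def echo_tool(call):
    # Stand-in for a real tool: returns what it was asked to do.
    return {"tool": call.name, "args": call.args}


policy = [block_shell, redact_secrets]
result = run_with_intercepts(
    ToolCall("search", {"q": "rust", "api_key": "s3cr3t"}), policy, echo_tool
)
```

The `policy` list is defined once and reused at every call boundary, which is the property the bullet describes; how NeMo Relay itself registers and orders guardrails is covered by its own documentation.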

What You Get

  • Managed tool and LLM execution: Run call boundaries through consistent lifecycle helpers and middleware ordering.

  • Concurrent request isolation: Keep request-local middleware and subscribers attached to the scope that owns them, then clean them up when that scope closes.

  • Multi-language semantics: Use the same runtime model from Rust, Python, and Node.js.

  • Observability-ready events: Preserve model metadata, tool call IDs, inputs, outputs, scope relationships, and lifecycle timing for downstream analysis.

  • Built-in observability plugin: Configure Agent Trajectory Observability Format (ATOF), ATIF, OpenTelemetry, and OpenInference exporters without registering subscribers by hand.

  • Non-blocking subscriber delivery: Keep managed execution moving while subscriber callbacks and exporters drain in the background. Flush subscribers before relying on callback side effects or exported files in tests and shutdown paths.

  • Extension points for framework authors: Wrap stable tool and provider callbacks while preserving framework-owned scheduling, retries, memory, and result handling.
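The advice to flush before relying on callback side effects follows from how background delivery works in general. The sketch below shows the pattern with Python's standard `queue` and `threading` modules only. It is not NeMo Relay code, and `BackgroundSubscribers` with its `subscribe`, `emit`, and `flush` methods is a hypothetical stand-in.

```python
import queue
import threading


class BackgroundSubscribers:
    """Delivers events to callbacks on a worker thread (illustrative only)."""

    def __init__(self):
        self._queue = queue.Queue()
        self._callbacks = []
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def subscribe(self, callback):
        self._callbacks.append(callback)

    def emit(self, event):
        # Returns immediately; the worker thread invokes the callbacks later.
        self._queue.put(event)

    def flush(self):
        # Blocks until every queued event has been handed to every callback.
        self._queue.join()

    def _drain(self):
        while True:
            event = self._queue.get()
            try:
                for callback in list(self._callbacks):
                    callback(event)
            finally:
                self._queue.task_done()


seen = []
bus = BackgroundSubscribers()
bus.subscribe(seen.append)
for i in range(3):
    bus.emit({"kind": "tool_end", "seq": i})
bus.flush()  # without this call, `seen` may still be incomplete here
```

Because `emit` only enqueues, the caller never waits on a slow exporter. The cost is that a test asserting on `seen`, or a shutdown path reading an exported file, has to call `flush` first.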

    flowchart LR
        App[Application or Framework]

        subgraph Runtime[NeMo Relay Runtime]
            direction TB
            Scopes[Scopes]
            Middleware[Middleware]
            Plugins[Plugins]
            Events[Lifecycle Events]
        end

        Output[Subscribers and Exporters]

        App --> Scopes
        App --> Middleware
        Plugins --> Middleware
        Scopes --> Events
        Middleware --> Events
        Events --> Output
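The Scopes → Lifecycle Events → Subscribers path in the diagram can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the NeMo Relay API: the `Runtime` class, its `scope` context manager, and the event field names (`kind`, `scope`, `parent`, `name`, `ts`) are all hypothetical.

```python
import itertools
import time
from contextlib import contextmanager


class Runtime:
    """Toy runtime: nested scopes emit start/end events to subscribers."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._stack = []        # currently open scope IDs, innermost last
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def _emit(self, kind, scope_id, parent_id, name):
        event = {"kind": kind, "scope": scope_id, "parent": parent_id,
                 "name": name, "ts": time.monotonic()}
        for callback in self._subscribers:
            callback(event)

    @contextmanager
    def scope(self, name):
        scope_id = next(self._ids)
        parent_id = self._stack[-1] if self._stack else None
        self._stack.append(scope_id)
        self._emit("scope_start", scope_id, parent_id, name)
        try:
            yield scope_id
        finally:
            # Close the scope even if the body raised.
            self._stack.pop()
            self._emit("scope_end", scope_id, parent_id, name)


events = []
rt = Runtime()
rt.subscribe(events.append)
with rt.scope("request"):
    with rt.scope("tool:search"):
        pass
    with rt.scope("llm:chat"):
        pass
```

Each event carries its parent scope ID, so a subscriber can rebuild the parent-child execution tree from the stream alone. This toy version delivers events synchronously and keeps one shared stack, so it is single-threaded by construction; isolating concurrent requests would need per-task state such as `contextvars`.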

Installation

Install the published package for your language:

    # Rust
    cargo add nemo-relay

    # Python
    uv add nemo-relay

    # Node.js
    npm install nemo-relay-node

The Node.js package requires Node.js 24 or newer.

CLI Installation

The NeMo Relay CLI is offered as a separate crate:

    cargo install nemo-relay-cli

If cargo-binstall is available on your machine:

    cargo binstall nemo-relay-cli

For source builds, testing, and contribution workflow, see [CONTRIBUTING.md](CONTRIBUTING.md).

Documentation

End-user documentation lives at docs.nvidia.com/nemo/relay.

The primary documentation track covers Rust, Python, and Node.js.

The Go, WebAssembly, and raw FFI surfaces are currently experimental and remain source-first under go/nemo_relay, crates/wasm, and crates/ffi.

Binding Status

The table below summarizes the support level for each binding surface.

| Binding | Status | Notes |
|---|---|---|
| Python | ✅ Fully Supported | Fully documented with Quick Start and Guides |
| Node.js | ✅ Fully Supported | Fully documented with Quick Start and Guides |
| Rust | ✅ Fully Supported | Fully documented with Quick Start and Guides |
| NeMo Relay CLI | 🚧 Experimental | Install with cargo install nemo-relay-cli. |
| Go | 🚧 Experimental | Source-first under go/nemo_relay. |
| WebAssembly | 🚧 Experimental | Source-first under crates/wasm. |
| FFI | 🚧 Experimental | Source-first under crates/ffi. |

Agent Harness Support

NeMo Relay CLI offers experimental support for several agent harnesses. Refer to the NeMo Relay CLI documentation for additional information.

Below is our support matrix for agent harnesses.

| Agent | Observability | Security | Optimization | Notes |
|:--|:--:|:--:|:--:|:--|
| Claude Code | ✅ Yes | ⚠️ Partial | ⚠️ Partial | Tool guardrail support is wired up. LLM optimization is in place. |
| Codex | ✅ Yes | ⚠️ Partial | ⚠️ Partial | Tool guardrail support is wired up. LLM optimization is in place. Missing some necessary hooks for full feature parity. |
| Hermes Agent | ✅ Yes | ⚠️ Partial | ⚠️ Partial | Tool guardrail support is wired up. LLM optimization is in place. |
| Cursor | ✅ Yes | ⚠️ Partial | ⚠️ Partial | Tool guardrail support is wired up. LLM optimization is in place. Not feature-rich, missing hooks under cursor-agent |

Third-Party Integrations

Some framework integrations are maintained as packages…

Notability

notability 6.0/10

New NVIDIA repo, moderate traction