NVIDIA/nvrc
Description: The NVRC project provides a Rust binary that implements a simple init system for microVMs.
Language: Rust
License: Apache-2.0
Stars: 35
Forks: 16
Open issues: 20
Created: 2024-07-17T15:58:39Z
Pushed: 2026-06-10T18:13:47Z
Default branch: main
Fork: no
Archived: no
README:
NVRC - NVIDIA Runtime Container Init

A minimal init system (PID 1) for ephemeral NVIDIA GPU-enabled VMs running under Kata Containers. NVRC sets up GPU drivers, configures hardware, spawns NVIDIA management daemons, and hands off to kata-agent for container orchestration.
Design Philosophy
Fail Fast, Fail Hard: NVRC is designed for ephemeral confidential VMs where any configuration failure should immediately terminate the VM. There are no recovery mechanisms—if GPU initialization fails, the VM powers off. This "panic-on-failure" approach ensures:
- Security: No undefined states in confidential computing environments
- Simplicity: No complex error recovery logic to audit
- Clarity: If it's running, it's configured correctly
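The panic-on-failure approach can be sketched with Rust's standard panic hook. This is a minimal illustration, not NVRC's actual code: `power_off`, `install_panic_hook`, `require`, and the `POWER_OFF_REQUESTED` flag are hypothetical names, and the real power-off request is replaced by a flag so the sketch stays runnable outside a VM.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Records that the power-off path was taken (stand-in for a real shutdown).
static POWER_OFF_REQUESTED: AtomicBool = AtomicBool::new(false);

/// Hypothetical placeholder: a real PID 1 would ask the kernel to power off
/// here (for example via the reboot(2) syscall) and would never return.
fn power_off() {
    POWER_OFF_REQUESTED.store(true, Ordering::SeqCst);
}

/// Install a process-wide hook so every panic ends in `power_off`.
fn install_panic_hook() {
    panic::set_hook(Box::new(|info| {
        eprintln!("fatal: {info}");
        power_off();
    }));
}

/// A setup step that fails by panicking instead of returning an error,
/// so there is no recovery path to audit.
fn require(condition: bool, what: &str) {
    if !condition {
        panic!("setup step failed: {what}");
    }
}
```

With the hook installed first, any later call such as `require(module_loaded, "load nvidia.ko")` either succeeds or takes the power-off path, which matches the "if it's running, it's configured correctly" property described above.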
Architecture
flowchart TD
    Start([NVRC starts as PID 1]) --> PanicHook[Set panic hook<br/>power off VM on panic]
    PanicHook --> MountFS[Mount filesystems<br/>/proc /dev /sys /run /tmp]
    MountFS --> LoopbackUp[Bring up loopback interface]
    LoopbackUp --> InitKernlog[Initialize kernel logging]
    InitKernlog --> PollSyslogOnce[Poll syslog once]
    PollSyslogOnce --> ParseKernel[Parse kernel parameters<br/>/proc/cmdline]
    ParseKernel --> DetectMode[Detect mode]
    DetectMode --> ModeSelect{Mode?}
    ModeSelect -->|gpu default| GPUMode[GPU Mode]
    ModeSelect -->|cpu| CPUMode[CPU Mode]
    ModeSelect -->|servicevm-nvl4| NVL4Mode[ServiceVM NVL4<br/>H100/H200/H800]
    ModeSelect -->|servicevm-nvl5| NVL5Mode[ServiceVM NVL5<br/>B100/B200/B300]
    GPUMode --> GPUSteps[• Load nvidia.ko nvidia-uvm<br/>• Start nvidia-persistenced<br/>• nvidia-smi: lmc lgc pl srs<br/>• nv-hostengine dcgm-exporter<br/>• Generate CDI spec<br/>• Health checks]
    CPUMode --> CPUSteps[• Skip GPU initialization]
    NVL4Mode --> NVL4Steps[• Load nvidia.ko<br/>• Start fabric-mgr greedy<br/>• Health checks]
    NVL5Mode --> NVL5Steps[• Load ib_umad mlx5_ib<br/>• Detect CX7 port GUID<br/>• Start nvlsm<br/>• Start fabric-mgr symmetric<br/>• Health checks]
    GPUSteps --> Lockdown
    CPUSteps --> Lockdown
    NVL4Steps --> Lockdown
    NVL5Steps --> Lockdown
    Lockdown[Disable kernel module loading<br/>security lockdown]
    Lockdown --> ForkAgent[Fork kata-agent<br/>handoff control to guest agent]
    ForkAgent --> PollSyslog[Poll syslog forever<br/>keep PID 1 alive]
    style Start fill:#e1f5ff
    style PollSyslog fill:#e1f5ff
    style GPUMode fill:#c8e6c9
    style CPUMode fill:#fff9c4
    style NVL4Mode fill:#ffccbc
    style NVL5Mode fill:#ffccbc

Kernel Parameters
NVRC is configured entirely via kernel command-line parameters (no config files). This is critical for minimal init environments where userspace configuration doesn't exist yet.
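As a rough illustration of configuration by kernel command line, the sketch below collects `nvrc.*` key=value pairs from a command-line string. The function name `parse_nvrc_params` is hypothetical and this is not NVRC's actual parser. It assumes whitespace-separated tokens with no quoting, and a later duplicate key overwrites an earlier one.

```rust
use std::collections::HashMap;

/// Collect `nvrc.*` key=value pairs from a kernel command line string.
/// Tokens without an `=` or without the `nvrc.` prefix are ignored.
/// Assumption: values contain no spaces, so quoting is not handled.
fn parse_nvrc_params(cmdline: &str) -> HashMap<String, String> {
    cmdline
        .split_whitespace()
        .filter_map(|token| token.split_once('='))
        .filter(|(key, _)| key.starts_with("nvrc."))
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}
```

In a running guest the input would typically come from `std::fs::read_to_string("/proc/cmdline")`. For the string `console=ttyS0 nvrc.mode=gpu quiet nvrc.log=info`, the resulting map holds only `nvrc.mode` and `nvrc.log`.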
Core Parameters
| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| nvrc.mode | gpu, cpu, nvswitch-nvl4, nvswitch-nvl5 | gpu | Operation mode. cpu for CPU-only, nvswitch-nvl4 for H100/H200/H800 service VMs, nvswitch-nvl5 for B200/B300/B100 service VMs. |
| nvrc.log | off, error, warn, info, debug, trace | off | Log verbosity level. Also enables /proc/sys/kernel/printk_devkmsg. |
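The `nvrc.mode` values in the table map naturally onto an enum with `gpu` as the default. The sketch below is one possible shape, not NVRC's actual type: the `Mode` enum and `mode_from_param` are hypothetical names. Returning an error for an unknown value is also an assumption, since the README does not say how unrecognized modes are handled.

```rust
use std::str::FromStr;

/// Operation modes listed for `nvrc.mode`; `Gpu` is the documented default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum Mode {
    #[default]
    Gpu,
    Cpu,
    NvswitchNvl4,
    NvswitchNvl5,
}

impl FromStr for Mode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "gpu" => Ok(Mode::Gpu),
            "cpu" => Ok(Mode::Cpu),
            "nvswitch-nvl4" => Ok(Mode::NvswitchNvl4),
            "nvswitch-nvl5" => Ok(Mode::NvswitchNvl5),
            // Assumption: unknown values are rejected rather than ignored.
            other => Err(format!("unknown nvrc.mode value: {other}")),
        }
    }
}

/// Resolve the mode from an optional parameter value, using the default
/// when the parameter is absent from the command line.
fn mode_from_param(value: Option<&str>) -> Result<Mode, String> {
    match value {
        None => Ok(Mode::default()),
        Some(v) => v.parse(),
    }
}
```

Under the fail-fast philosophy, a caller could simply unwrap the result so that an invalid mode takes the same panic path as any other configuration failure.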
GPU Configuration
| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| nvrc.smi.lgc | | - | Lock GPU core clocks to fixed frequency. Eliminates thermal throttling for consistent performance. |
| nvrc.smi.lmc | | - | Lock memory clocks to fixed frequency. Used alongside lgc for fully deterministic GPU behavior. |
| nvrc.smi.pl | | - | Set GPU power limit. Lower values reduce heat/power; higher allows peak performance. |
| nvrc.smi.srs | enabled, disabled | - | Secure Randomization Seed for GPU memory (passed to nvidia-smi). |
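To show how these parameters might translate into `nvidia-smi` calls, the sketch below builds one argument list per configured setting using the `-lgc`, `-lmc`, and `-pl` options of `nvidia-smi`. The `SmiConfig` struct and `smi_invocations` function are hypothetical, and how NVRC actually invokes `nvidia-smi` is not stated in this README. The `srs` setting is left out because its exact command form is not given here.

```rust
/// Optional GPU tuning values mirroring nvrc.smi.lgc, nvrc.smi.lmc, nvrc.smi.pl.
/// Assumption: each value is a single unsigned integer.
#[derive(Debug, Default)]
struct SmiConfig {
    lgc: Option<u32>,
    lmc: Option<u32>,
    pl: Option<u32>,
}

/// Build one nvidia-smi argument list per setting that is present.
/// Unset parameters produce no invocation at all.
fn smi_invocations(cfg: &SmiConfig) -> Vec<Vec<String>> {
    let settings = [("-lgc", cfg.lgc), ("-lmc", cfg.lmc), ("-pl", cfg.pl)];
    settings
        .iter()
        .filter_map(|&(flag, value)| value.map(|v| vec![flag.to_string(), v.to_string()]))
        .collect()
}
```

Each inner vector could be handed to `std::process::Command::new("nvidia-smi").args(...)`. Keeping argument construction separate from process spawning makes the mapping easy to unit test without a GPU.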
Daemon Control
| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| nvrc.uvm.persistence.mode | on/off, true/false, 1/0, yes/no | true | UVM persistence mode keeps unified memory state across CUDA context teardowns. |
| nvrc.dcgm | on/off, true/false, 1/0, yes/no | false | Enable DCGM (Data Center GPU Manager) for telemetry and health monitoring. |
| nvrc.fm.mode | 0, 1 | - | Fabric Manager mode: 0=bare metal, 1=servicevm (shared nvswitch). Auto-set in nvswitch modes. |
| nvrc.fm.rail.policy | greedy, symmetric | greedy | Partition rail policy. Symmetric required for Confidential Computing on Blackwell. |
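The boolean-style parameters accept several spellings (on/off, true/false, 1/0, yes/no). A small helper along the following lines could normalize them. `parse_switch` and `switch_or_default` are hypothetical names, and treating the input case-insensitively is an assumption not confirmed by the table.

```rust
/// Map the accepted spellings to a bool; anything else yields `None`.
/// Assumption: matching is case-insensitive.
fn parse_switch(value: &str) -> Option<bool> {
    match value.to_ascii_lowercase().as_str() {
        "on" | "true" | "1" | "yes" => Some(true),
        "off" | "false" | "0" | "no" => Some(false),
        _ => None,
    }
}

/// Apply a documented default when the parameter is absent.
/// A present but unrecognized value still yields `None`.
fn switch_or_default(value: Option<&str>, default: bool) -> Option<bool> {
    match value {
        None => Some(default),
        Some(v) => parse_switch(v),
    }
}
```

For example, `switch_or_default(None, true)` models `nvrc.uvm.persistence.mode` when it is not given, while `switch_or_default(Some("on"), false)` models `nvrc.dcgm=on`.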
Example Configurations
Minimal GPU setup (defaults):
nvrc.mode=gpu
CPU-only mode:
nvrc.mode=cpu
NVSwitch NVL4 mode (Service VM for HGX H100/H200/H800 - NVLink 4.0):
nvrc.mode=nvswitch-nvl4
NVSwitch NVL5 mode (Service VM for HGX B200/B300/B100 - NVLink 5.0):
nvrc.mode=nvswitch-nvl5
GPU with locked clocks for benchmarking:
nvrc.mode=gpu nvrc.smi.lgc=1500 nvrc.smi.lmc=5001 nvrc.smi.pl=300
GPU with DCGM monitoring:
nvrc.mode=gpu nvrc.dcgm=on nvrc.log=info
Multi-GPU with NVLink:
nvrc.mode=gpu nvrc.fm.mode=0 nvrc.log=debug
Build
NVRC is compiled as a statically-linked musl binary for minimal dependencies:
# x86_64
cargo build --release --target x86_64-unknown-linux-musl

# aarch64
cargo build --release --target aarch64-unknown-linux-musl
Build configuration in .cargo/config.toml enables aggressive size optimization and static linking.
Testing
# Unit tests (requires root for some tests)
cargo test

# Coverage (requires llvm-cov and root)
cargo llvm-cov --all-features --workspace

# Fuzzing
cargo +nightly fuzz run kernel_params

# Static analysis
cargo clippy --all-features -- -D warnings
cargo audit
cargo deny check...