RepoNVIDIANVIDIApublished Dec 15, 2025seen 5d

NVIDIA/cuCascade

C++

Open original ↗

Captured source

source ↗
published Dec 15, 2025seen 5dcaptured 9hhttp 200method plain

NVIDIA/cuCascade

Description: GPU Memory Reservation Library

Language: C++

License: Apache-2.0

Stars: 50

Forks: 24

Open issues: 17

Created: 2025-12-15T17:53:16Z

Pushed: 2026-06-10T20:34:48Z

Default branch: main

Fork: no

Archived: no

README:

cuCascade

A high-performance GPU memory management library for data-intensive applications requiring intelligent tiered memory allocation across GPU, host, and disk storage.

Overview

Key Features:

  • Tiered Memory Management: Seamlessly manage GPU (fastest), pinned host (medium), and disk (largest capacity) memory tiers, provides numa aware allocators
  • Memory Reservation System: Avoid oversubscribing your GPU by making reservations and using allocators that respect reservations
  • Hardware Topology Discovery: Automatic detection of NUMA regions and GPU-CPU affinity for optimal memory placement
  • Stream-Aware Tracking: Per-stream memory usage tracking and reservation enforcement
  • cuDF Integration: Native support for GPU DataFrames with batch processing capabilities and spilling to Host or Disk
  • Pluggable Policies: Control what happens when you OOM, try to allocate more than a reservation, how you pick what data to spill, by creating policies that plug into the system.

Getting Started

# Option A: Using Pixi (recommended)
curl -fsSL https://pixi.sh/install.sh | bash
git clone https://github.com/nvidia/cuCascade.git
cd cuCascade
pixi install
pixi run build

# Build in debug mode
pixi run build-debug

# Option B: Using CMake directly
git clone https://github.com/nvidia/cuCascade.git
cd cuCascade
cmake --preset release
cmake --build build/release

# Run tests
pixi run test

# Run benchmarks (optional)
pixi run benchmarks

Requirements

  • OS/Arch: Linux (x86_64, aarch64)
  • Compiler: C++20 compatible compiler
  • Build Tools: CMake 4.1+, Ninja
  • GPU/Drivers: CUDA 13+, compatible NVIDIA driver
  • Dependencies: libcudf 25.10+

Usage

#include
#include
#include

using namespace cucascade::memory;

// 1. Discover hardware topology
topology_discovery discovery;
if (!discovery.discover()) {
// Handle discovery failure
}
auto const& topology = discovery.get_topology();

// 2. Configure the memory reservation manager
reservation_manager_configurator configurator;
configurator.set_gpu_usage_limit(4ULL << 30) // 4GB per GPU
.set_reservation_limit_ratio_per_gpu(0.8) // Reserve up to 80%
.set_capacity_per_numa_node(16ULL << 30) // 16GB per NUMA node
.bind_cpu_tier_to_gpus(); // Bind CPU tiers to GPUs

// 3. Create the manager with the discovered topology
auto configs = configurator.build(topology);
memory_reservation_manager manager(std::move(configs));

// 4. Request reservations using strategies
// Request 1GB on any available GPU
auto gpu_res = manager.request_reservation(any_memory_space_in_tier(Tier::GPU), 1ULL << 30);

// Request 2GB on a specific host NUMA node
auto host_res = manager.request_reservation(specific_memory_space(Tier::HOST, 0), 2ULL << 30);
  • More examples: See test/ directory for comprehensive usage examples

Documentation

Comprehensive documentation is available in the docs/ directory:

  • [Architecture Overview](docs/ARCHITECTURE.md): High-level description of the library's design, core components, and intended usage flows with Mermaid diagrams.
  • [API Reference](docs/API_REFERENCE.md): Detailed API documentation and class hierarchies automatically generated from code comments.

To regenerate the documentation:

pixi run docs

Contribution Guidelines

  • Start here: CONTRIBUTING.md
  • Code of Conduct: CODE_OF_CONDUCT.md
  • Development quickstart:
git clone https://github.com/nvidia/cuCascade.git
cd cuCascade
pixi install
pixi run build
pixi run build-debug # Debug mode
pixi run test

# Run benchmarks
pixi run benchmarks

Pre-commit Hooks

This project uses pre-commit for code quality checks including C++/CUDA formatting (clang-format), CMake linting, spell checking, and more.

# Run all checks manually
pixi run lint

# Install hooks to run automatically on every commit
pixi run lint-install

# Update hook versions
pre-commit autoupdate

Security

  • Vulnerability disclosure: SECURITY.md
  • Do not file public issues for security reports.
  • Report vulnerabilities via: https://www.nvidia.com/object/submit-security-vulnerability.html

Support

  • Level: Experimental
  • How to get help: GitHub Issues
  • For NVIDIA product security concerns: https://www.nvidia.com/en-us/security

Project Structure

cuCascade/
├── include/
│ ├── data/ # Data representation headers
│ │ ├── common.hpp # Common data utilities
│ │ ├── data_batch.hpp # Batch processing for data
│ │ ├── data_repository.hpp # Data storage abstraction
│ │ ├── data_repository_manager.hpp
│ │ ├── cpu_data_representation.hpp
│ │ └── gpu_data_representation.hpp
│ └── memory/ # Memory management headers
│ ├── common.hpp # Tier enum, memory_space_id, utilities
│ ├── memory_reservation_manager.hpp # Central reservation coordinator
│ ├── memory_reservation.hpp # Reservation types and policies
│ ├── memory_space.hpp # Memory space abstraction
│ ├── reservation_aware_resource_adaptor.hpp # GPU memory resource
│ ├── fixed_size_host_memory_resource.hpp # Host memory resource
│ ├── disk_access_limiter.hpp # Disk tier limiter
│ ├── reservation_manager_configurator.hpp # Builder for config
│ ├── topology_discovery.hpp # Hardware topology detection
│ ├── numa_region_pinned_host_allocator.hpp # NUMA-aware allocator
│ ├── notification_channel.hpp # Cross-reservation signaling
│ ├── stream_pool.hpp # CUDA stream management
│ └── oom_handling_policy.hpp # OOM handling strategies
├── src/
│ ├── data/ # Data representation implementation
│ └── memory/ # Memory management implementation
├── test/
│ ├── data/ # Data module tests
│ ├── memory/ # Memory module tests
│ └── utils/ # Test utilities (cuDF helpers)
├── benchmark/ # Performance benchmarks
│ ├── benchmark_representation_converter.cpp # Converter benchmarks
│ └── README.md # Benchmark documentation
├── cmake/ # CMake configuration modules
├── CMakeLists.txt # Main CMake configuration
├── CMakePresets.json # CMake presets for build configurations
└── pixi.toml # Pixi dependency management

References

License

This project is…

Excerpt shown — open the source for the full document.

Notability

notability 3.0/10

New repo by NVIDIA, low stars (49), routine release.