Repo · NVIDIA · published Nov 20, 2023

NVIDIA/cuvs

Cuda


Description: cuVS - a library for vector search and clustering on the GPU

Language: Cuda

License: Apache-2.0

Stars: 790

Forks: 196

Open issues: 629

Created: 2023-11-20T15:42:25Z

Pushed: 2026-06-25T05:11:56Z

Default branch: main

Fork: no

Archived: no

README:

cuVS: Vector Search and Clustering on the GPU

Contents

1. [Useful Resources](#useful-resources)
2. [What is cuVS?](#what-is-cuvs)
3. [Installing cuVS](#installing-cuvs)
4. [Getting Started](#getting-started)
5. [Contributing](#contributing)
6. [References](#references)

Useful Resources

What is cuVS?

cuVS contains state-of-the-art implementations of several algorithms for running approximate nearest neighbors and clustering on the GPU. It can be used directly or through the various databases and other libraries that have integrated it. The primary goal of cuVS is to simplify the use of GPUs for vector similarity search and clustering.

Vector search is an information retrieval method that has been growing in popularity over the past few years, partly because of the rising importance of multimedia embeddings created from unstructured data and the need to perform semantic search on the embeddings to find items which are semantically similar to each other.

Vector search is also used in _data mining and machine learning_ tasks and comprises an important step in many _clustering_ and _visualization_ algorithms like UMAP, t-SNE, K-means, and HDBSCAN.
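To illustrate why nearest-neighbor search sits inside clustering algorithms, here is a minimal pure-Python sketch of one K-means (Lloyd) iteration. It is not cuVS code; the function names `nearest_centroid` and `kmeans_step` are hypothetical and chosen for this example. The assignment step is a 1-nearest-neighbor search of every point against the centroids, which is the part a GPU library would accelerate.

```python
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (a 1-NN search)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans_step(points, centroids):
    """One Lloyd iteration: assign points to nearest centroids, then recompute means."""
    labels = [nearest_centroid(p, centroids) for p in points]
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            # Keep the previous centroid when a cluster receives no points.
            new_centroids.append(old)
            continue
        dim = len(old)
        mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        new_centroids.append(mean)
    return labels, new_centroids


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = [(1.0, 1.0), (9.0, 9.0)]
labels, centroids = kmeans_step(points, centroids)
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [(0.0, 0.5), (10.0, 10.5)]
```

The brute-force assignment costs one distance computation per point per centroid, so it grows quickly with dataset size and cluster count.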

Finally, faster vector search enables interactions between dense vectors and graphs. Converting a pile of dense vectors into nearest neighbors graphs unlocks the entire world of graph analysis algorithms, such as those found in GraphBLAS and cuGraph.
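As a rough illustration of turning dense vectors into a nearest neighbors graph, the sketch below builds an exact k-NN graph by brute force in pure Python. It does not use cuVS, and the helper name `knn_graph` is hypothetical; it only shows the data structure that graph algorithms would then consume.

```python
import math


def knn_graph(vectors, k):
    """Brute-force k-NN graph: map each vector index to its k closest other indices."""
    graph = {}
    for i, v in enumerate(vectors):
        # Distance from vector i to every other vector (self excluded).
        candidates = [(math.dist(v, w), j) for j, w in enumerate(vectors) if j != i]
        candidates.sort()
        graph[i] = [j for _, j in candidates[:k]]
    return graph


vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
graph = knn_graph(vectors, k=2)
print(graph)  # {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [2, 1]}
```

The resulting edges are directed: vector 3 points to vector 2, but vector 2 does not list vector 3 among its own neighbors. The all-pairs loop is quadratic in the number of vectors, which is why approximate, GPU-accelerated construction matters at scale.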

Below are some common use cases for vector search:

  • Semantic search
      • Generative AI & Retrieval augmented generation (RAG)
      • Recommender systems
      • Computer vision
      • Image search
      • Text search
      • Audio search
      • Molecular search
      • Model training
  • Data mining
      • Clustering algorithms
      • Visualization algorithms
      • Sampling algorithms
      • Class balancing
      • Ensemble methods
      • k-NN graph construction

Why cuVS?

There are several benefits to using cuVS and GPUs for vector search, including:

1. Fast index build
2. Latency critical and high throughput search
3. Parameter tuning
4. Cost savings
5. Interoperability (build on GPU, deploy on CPU)
6. Multiple language support
7. Building blocks for composing new or accelerating existing algorithms

In addition to the items above, cuVS shoulders the burden of keeping non-trivial accelerated code up to date as new NVIDIA architectures and CUDA versions are released. This provides a delightful development experience, guaranteeing that any libraries, databases, or applications built on top of it will always be getting the best performance and scale.

cuVS Technology Stack

cuVS is built on top of the RAPIDS RAFT library of high performance machine learning primitives and provides all the necessary routines for vector search and clustering on the GPU.

![cuVS is built on top of low-level CUDA libraries and provides many important routines that enable vector search and clustering on the GPU](img/tech_stack.png "cuVS Technology Stack")

Installing cuVS

cuVS comes with pre-built packages that can be installed through conda, pip, or tarball. Different packages are available for the different languages supported by cuVS.

> [!NOTE]
> If compiled binary size is a concern, please note that the cuVS builds for CUDA 13 are roughly half the size of CUDA 12 builds. This is a result of improved compression rates in the newer supported CUDA drivers. We will be adopting the newer drivers for CUDA 12 builds in Spring of 2026, which will ultimately bring them down to roughly the size of the CUDA 13 builds. In the meantime, the NVIDIA cuVS team is continuing to shave down the binary sizes for all supported CUDA versions. If binary size is an issue for you, please consider linking to cuVS statically, either by building from source or by using the pre-built libcuvs-static conda package.

Please see the Build and Install Guide for more information on installing the available cuVS packages and building from source.

Getting Started

The following code snippets build an approximate nearest neighbors index with the CAGRA algorithm in each of the languages supported by cuVS.

Python API

```python
from cuvs.neighbors import cagra

dataset = load_data()
index_params = cagra.IndexParams()

index = cagra.build(index_params, dataset)
```
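Once an index is built, the natural next step is to query it. The sketch below is an assumption-laden illustration, not part of the original README: it assumes the `cagra.SearchParams` class and `cagra.search` function from the cuVS Python package, a CUDA-capable GPU, and CuPy for device arrays. The wrapper name `build_and_search` is hypothetical.

```python
def build_and_search(dataset, queries, k=10):
    """Build a CAGRA index on `dataset` and return the k nearest neighbors of `queries`.

    Both arguments are assumed to be float32 arrays in GPU memory, such as CuPy arrays.
    """
    # Imported lazily so this module can be loaded on a machine without cuVS or a GPU.
    import cupy as cp
    from cuvs.neighbors import cagra

    # Build the index with default parameters, as in the snippet above.
    index = cagra.build(cagra.IndexParams(), dataset)

    # Assumed search call: default search parameters, k neighbors per query.
    distances, neighbors = cagra.search(cagra.SearchParams(), index, queries, k)

    # Wrap the returned device arrays so they can be used as CuPy arrays.
    return cp.asarray(distances), cp.asarray(neighbors)
```

On a GPU machine, one way to try it is to generate random float32 data with `cupy.random.random_sample((10000, 64), dtype=cupy.float32)` for the dataset and a smaller array with the same number of columns for the queries. Default parameters are a starting point only; search quality and speed depend on tuning them for the dataset.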

C++ API

```cpp
#include <cuvs/neighbors/cagra.hpp>

using namespace cuvs::neighbors;

raft::device_matrix_view dataset = load_dataset();
raft::device_resources res;

cagra::index_params index_params;

auto index = cagra::build(res, index_params, dataset);
```

For more code examples of the C++ APIs, including drop-in CMake project templates, please refer to the C++ examples directory in the codebase.

C API

```c
#include <cuvs/neighbors/cagra.h>

cuvsResources_t res;
cuvsCagraIndexParams_t index_params;
cuvsCagraIndex_t index;

DLManagedTensor *dataset;
load_dataset(dataset);

cuvsResourcesCreate(&res);
cuvsCagraIndexParamsCreate(&index_params);
cuvsCagraIndexCreate(&index);

cuvsCagraBuild(res,...
```


Notability

notability 6.0/10

New NVIDIA repo with 790 stars, notable but not major.