Published May 8, 2025

Accelerating GPU indexes in Faiss with NVIDIA cuVS

Engineering at Meta

By Junjie Qi, Gergely Szilvasy, Michael Norris, Vishal Gandhi

Meta and NVIDIA collaborated to accelerate vector search on GPUs by integrating NVIDIA cuVS into Faiss v1.10, Meta’s open source library for similarity search.

This new cuVS-backed implementation is more performant than classic GPU-accelerated search in some areas.

For inverted file (IVF) indexing, NVIDIA cuVS builds indexes up to 4.7x faster than the classic GPU-accelerated IVF implementations and reduces search latency by as much as 8.1x.

For graph indexing, CUDA ANN Graph (CAGRA) builds indexes up to 12.3x faster than CPU Hierarchical Navigable Small World (HNSW) graphs and reduces search latency by as much as 4.7x.
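For readers new to the index families named above, the inverted file (IVF) idea can be sketched in a few lines of pure Python: vectors are grouped under their nearest coarse centroid, and a query scans only the `nprobe` closest groups instead of the whole dataset. This is a toy illustration under stated assumptions, not how Faiss or cuVS implement it. The helper names `build_ivf` and `search_ivf`, the hand-picked centroids, and the tiny 2-D dataset are all hypothetical.

```python
import heapq

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Append each vector id to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists

def search_ivf(query, vectors, centroids, lists, k, nprobe):
    """Return the k closest ids, scanning only the nprobe nearest lists."""
    probe = heapq.nsmallest(nprobe, range(len(centroids)),
                            key=lambda c: sq_dist(query, centroids[c]))
    candidates = [vid for c in probe for vid in lists[c]]
    return heapq.nsmallest(k, candidates,
                           key=lambda vid: sq_dist(query, vectors[vid]))

# Hypothetical toy data: two well-separated clusters in 2-D.
vectors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
           (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
# Assumption: centroids are given; real IVF indexes learn them by clustering.
centroids = [(0.0, 0.0), (5.0, 5.0)]

lists = build_ivf(vectors, centroids)
result = search_ivf((4.9, 5.0), vectors, centroids, lists, k=2, nprobe=1)
print(lists)
print(result)
```

Raising `nprobe` scans more lists, which generally trades search speed for recall. That trade-off is why the benchmark results in this post are reported at a fixed recall level.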

The Faiss library

The Faiss library is an open source library, developed by Meta FAIR, for efficient vector search and clustering of dense vectors. Faiss pioneered vector search on GPUs, as well as the ability to seamlessly switch between GPUs and CPUs. It has made a lasting impact in both research and industry, being used as an integrated library in several databases (e.g., Milvus and OpenSearch), machine learning libraries, data processing libraries, and AI workflows. Faiss is also used heavily by researchers and data scientists as a standalone library, often paired with PyTorch.

Collaboration with NVIDIA

Three years ago, Meta and NVIDIA worked together to enhance the capabilities of vector search technology and to accelerate vector search on GPUs. Previously, in 2016, Meta had incorporated high-performing vector search algorithms made for NVIDIA GPUs: GpuIndexFlat, GpuIndexIVFFlat, and GpuIndexIVFPQ. After the partnership began, NVIDIA rapidly contributed GpuIndexCagra, a state-of-the-art graph-based index designed specifically for GPUs. In its latest release, Faiss 1.10.0 officially includes these algorithms from the NVIDIA cuVS library.

Faiss 1.10.0 also includes a new conda package that unlocks the ability to choose between the classic Faiss GPU implementations and the newer NVIDIA cuVS algorithms, making it easy for users to switch between GPU and CPU.

Benchmarking

The following benchmarks were conducted using the cuVS-bench tool.

We measured:

A tall, slender image dataset: a subset of 100 million 96-dimensional vectors from the Deep1B dataset.

A short, wide dataset of text embeddings: 5 million 1536-dimensional vector embeddings, curated using the OpenAI text-embedding-ada-002 model.

Tests for index build times and search latency were conducted on an NVIDIA H100 GPU and compared to an Intel Xeon Platinum 8480CL system. Results are reported in the tables below at 95% recall along the Pareto frontiers for k=10 nearest neighbors.
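Recall@10 is commonly read as the fraction of the true 10 nearest neighbors that the index returns, averaged over all queries. The post does not spell out the definition, so the sketch below assumes that common one. The function name `recall_at_k` and the small ID lists are hypothetical.

```python
def recall_at_k(retrieved, ground_truth, k=10):
    """Average fraction of the true top-k neighbor ids found in the retrieved top-k."""
    total = 0.0
    for got, truth in zip(retrieved, ground_truth):
        total += len(set(got[:k]) & set(truth[:k])) / k
    return total / len(ground_truth)

# Hypothetical neighbor ids for two queries, using k=4 to keep the example short.
ground_truth = [[1, 2, 3, 4], [5, 6, 7, 8]]   # exact nearest neighbors
retrieved = [[1, 2, 3, 9], [5, 6, 7, 8]]      # what an approximate index returned

recall = recall_at_k(retrieved, ground_truth, k=4)
print(recall)  # 0.875: the first query found 3 of 4, the second found 4 of 4
```

Reporting at a fixed recall level means each index is tuned until it reaches that accuracy, so build time and latency are compared at equal result quality.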

Build time (95% recall@10)

| Index (Faiss Classic) | Index (Faiss cuVS) | 100M x 96, Faiss Classic (seconds) | 100M x 96, Faiss cuVS (seconds) | 5M x 1536, Faiss Classic (seconds) | 5M x 1536, Faiss cuVS (seconds) |
|---|---|---|---|---|---|
| IVF Flat | IVF Flat | 101.4 | 37.9 (2.7x) | 24.4 | 15.2 (1.6x) |
| IVF PQ | IVF PQ | 168.2 | 72.7 (2.3x) | 42.0 | 9.0 (4.7x) |
| HNSW (CPU) | CAGRA | 3322.1 | 518.5 (6.4x) | 1106.1 | 89.7 (12.3x) |

Table 1: Index build times for Faiss-classic and Faiss-cuVS in seconds (with NVIDIA cuVS speedups in parentheses).
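The speedups in parentheses are the Faiss-classic time divided by the Faiss-cuVS time. A short arithmetic check using the build times reported in Table 1 follows. The dictionary layout and the `speedup` helper are illustrative names only.

```python
# Build times in seconds from Table 1, as (Faiss classic, Faiss cuVS) pairs.
build_times = {
    "IVF Flat":      {"100M x 96": (101.4, 37.9),   "5M x 1536": (24.4, 15.2)},
    "IVF PQ":        {"100M x 96": (168.2, 72.7),   "5M x 1536": (42.0, 9.0)},
    "HNSW vs CAGRA": {"100M x 96": (3322.1, 518.5), "5M x 1536": (1106.1, 89.7)},
}

def speedup(classic, cuvs):
    """Ratio of classic time to cuVS time, rounded to one decimal place."""
    return round(classic / cuvs, 1)

speedups = {
    index: {dataset: speedup(*times) for dataset, times in row.items()}
    for index, row in build_times.items()
}

for index, row in speedups.items():
    print(index, row)
```

The computed ratios match the parenthesized values in the table, including the 4.7x (IVF PQ) and 12.3x (CAGRA) maximums quoted in the summary.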

Search latency (95% recall@10)

| Index (Faiss Classic) | Index (Faiss cuVS) | 100M x 96, Faiss Classic (milliseconds) | 100M x 96, Faiss cuVS (milliseconds) | 5M x 1536, Faiss Classic (milliseconds) | 5M x 1536, Faiss cuVS (milliseconds) |
|---|---|---|---|---|---|
| IVF Flat | IVF Flat | 0.75 | 0.39 (1.9x) | 1.98 | 1.14 (1.7x) |
| IVF PQ | IVF PQ | 0.49 | 0.17 (2.9x) | 1.78 | 0.22 (8.1x) |
| HNSW (CPU) | CAGRA | 0.56 | 0.23 (2.4x) | 0.71 | 0.15 (4.7x) |

Table 2: Online (i.e., one at a time) search query latency for Faiss-classic and Faiss-cuVS in milliseconds (with NVIDIA cuVS speedups in parentheses).
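Because Table 2 reports one-at-a-time latency, a rough way to interpret a figure is to convert it into the number of queries a single client could issue back to back per second. The sketch below does that for the 5M x 1536 graph-index row. It is back-of-the-envelope arithmetic assuming the reported latency is the full per-query time, not a batched throughput measurement from the benchmark. The name `sequential_qps` is hypothetical.

```python
# Online search latencies in milliseconds for the 5M x 1536 dataset, from Table 2.
latency_ms = {"HNSW (CPU)": 0.71, "CAGRA": 0.15}

def sequential_qps(ms_per_query):
    """Queries per second if queries are issued strictly one after another."""
    return 1000.0 / ms_per_query

qps = {name: round(sequential_qps(ms)) for name, ms in latency_ms.items()}
print(qps)
```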

Looking forward

The emergence of state-of-the-art NVIDIA GPUs has revolutionized the field of vector search, enabling high recall and lightning-fast search speeds. The integration of Faiss and cuVS will continue to incorporate state-of-the-art algorithms, and we look forward to unlocking new innovations in this partnership between Meta and NVIDIA.

Read here for more details about NVIDIA cuVS.
