Accelerating GPU indexes in Faiss with NVIDIA cuVS
Source: Engineering at Meta
By Junjie Qi, Gergely Szilvasy, Michael Norris, Vishal Gandhi
Meta and NVIDIA collaborated to accelerate vector search on GPUs by integrating NVIDIA cuVS into Faiss v1.10, Meta’s open source library for similarity search.
In some areas, the new cuVS-backed implementation is more performant than the classic GPU-accelerated search in Faiss.
For inverted file (IVF) indexing, NVIDIA cuVS builds indexes up to 4.7x faster than the classic GPU-accelerated IVF implementation and reduces search latency by as much as 8.1x.
For graph indexing, CUDA ANN Graph (CAGRA) builds indexes up to 12.3x faster than CPU Hierarchical Navigable Small World (HNSW) graphs and reduces search latency by as much as 4.7x.
The Faiss library
The Faiss library is an open source library, developed by Meta FAIR, for efficient vector search and clustering of dense vectors. Faiss pioneered vector search on GPUs, as well as the ability to seamlessly switch between GPUs and CPUs. It has made a lasting impact in both research and industry, being used as an integrated library in several databases (e.g., Milvus and OpenSearch), machine learning libraries, data processing libraries, and AI workflows. Faiss is also used heavily by researchers and data scientists as a standalone library, often paired with PyTorch.
Collaboration with NVIDIA
Three years ago, Meta and NVIDIA began working together to enhance the capabilities of vector search technology and to accelerate vector search on GPUs. Previously, in 2016, Meta had incorporated high-performing vector search algorithms made for NVIDIA GPUs: GpuIndexFlat, GpuIndexIVFFlat, and GpuIndexIVFPQ. After the partnership began, NVIDIA rapidly contributed GpuIndexCagra, a state-of-the-art graph-based index designed specifically for GPUs. In its latest release, Faiss 1.10.0 officially includes these algorithms from the NVIDIA cuVS library.
Faiss 1.10.0 also includes a new conda package that unlocks the ability to choose between the classic Faiss GPU implementations and the newer NVIDIA cuVS algorithms, making it easy for users to switch between GPU and CPU.
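As a rough illustration of choosing between the two GPU backends from Python, the sketch below builds the same IVF-Flat index twice and toggles the `use_cuvs` field on the index config. It assumes a GPU build of Faiss 1.10 or later and a CUDA device, and the exact field name should be checked against the installed version. The helper name `build_gpu_ivf_flat`, the toy data, and the `nlist` and `nprobe` values are illustrative choices, not settings from the benchmarks.

```python
# Sketch only: needs a GPU build of Faiss (1.10 or later) and a CUDA device.
try:
    import numpy as np
    import faiss
    # GPU symbols are only present in GPU builds of Faiss.
    HAVE_GPU_FAISS = hasattr(faiss, "StandardGpuResources") and faiss.get_num_gpus() > 0
except ImportError:
    HAVE_GPU_FAISS = False


def build_gpu_ivf_flat(res, xb, nlist, use_cuvs):
    """Train and fill a GPU IVF-Flat index on the chosen backend (hypothetical helper)."""
    config = faiss.GpuIndexIVFFlatConfig()
    config.use_cuvs = use_cuvs  # False selects the classic Faiss GPU code path
    index = faiss.GpuIndexIVFFlat(res, xb.shape[1], nlist, faiss.METRIC_L2, config)
    index.train(xb)
    index.add(xb)
    return index


if HAVE_GPU_FAISS:
    rng = np.random.default_rng(0)
    xb = rng.random((20_000, 96), dtype=np.float32)  # toy data, not Deep1B
    xq = rng.random((5, 96), dtype=np.float32)
    res = faiss.StandardGpuResources()
    for use_cuvs in (True, False):
        try:
            index = build_gpu_ivf_flat(res, xb, nlist=64, use_cuvs=use_cuvs)
        except RuntimeError as err:  # e.g. a Faiss build without cuVS support
            print(f"use_cuvs={use_cuvs}: unavailable ({err})")
            continue
        index.nprobe = 8  # number of inverted lists scanned per query
        distances, ids = index.search(xq, 10)
        print(f"use_cuvs={use_cuvs}: ids shape {ids.shape}")
else:
    print("No GPU build of Faiss found; skipping the demo.")
```

Keeping the backend choice in a single config flag means the calling code (train, add, search) stays identical for both implementations.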
Benchmarking
The following benchmarks were conducted using the cuVS-bench tool.
We measured two datasets:
A tall, slender image dataset: a subset of 100 million 96-dimensional vectors from the Deep1B dataset.
A short, wide dataset of text embeddings: 5 million 1536-dimensional vector embeddings, curated using the OpenAI text-embedding-ada-002 model.
Tests for index build times and search latency were conducted on an NVIDIA H100 GPU and compared to an Intel Xeon Platinum 8480CL system. Results are reported in the tables below at 95% recall along the Pareto frontiers for k=10 nearest neighbors.
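To make the reporting criterion concrete, here is a minimal sketch of recall@k and of picking the fastest configuration on a recall/latency Pareto frontier that meets a recall target. The function names and the sample numbers are made up for illustration; this is not the cuVS-bench implementation.

```python
def recall_at_k(approx_ids, true_ids, k):
    """Fraction of the true k nearest neighbors found, averaged over queries."""
    hits = sum(len(set(a[:k]) & set(t[:k])) for a, t in zip(approx_ids, true_ids))
    return hits / (k * len(true_ids))


def pareto_frontier(points):
    """Keep (recall, latency_ms) points that no faster point matches or beats on recall."""
    frontier = []
    best_recall = -1.0
    # Scan from fastest to slowest; a point survives only if it raises recall.
    for recall, latency in sorted(points, key=lambda p: (p[1], -p[0])):
        if recall > best_recall:
            frontier.append((recall, latency))
            best_recall = recall
    return frontier


def fastest_at_recall(points, target):
    """Lowest-latency frontier point whose recall meets the target, or None."""
    eligible = [p for p in pareto_frontier(points) if p[0] >= target]
    return min(eligible, key=lambda p: p[1]) if eligible else None


# Hypothetical neighbor lists for two queries with k=4: 3 of 4 found for each.
approx = [[1, 2, 3, 4], [5, 6, 7, 8]]
truth = [[1, 2, 3, 9], [5, 6, 0, 8]]
print(recall_at_k(approx, truth, k=4))  # 0.75

# Hypothetical (recall, latency in ms) results from a parameter sweep.
runs = [(0.90, 0.10), (0.95, 0.20), (0.94, 0.25), (0.97, 0.30), (0.96, 0.40)]
print(pareto_frontier(runs))          # [(0.9, 0.1), (0.95, 0.2), (0.97, 0.3)]
print(fastest_at_recall(runs, 0.95))  # (0.95, 0.2)
```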
Build time (95% recall@10)
| Index (Faiss Classic) | Index (Faiss cuVS) | 100M x 96, Faiss Classic (seconds) | 100M x 96, Faiss cuVS (seconds) | 5M x 1536, Faiss Classic (seconds) | 5M x 1536, Faiss cuVS (seconds) |
| --- | --- | --- | --- | --- | --- |
| IVF Flat | IVF Flat | 101.4 | 37.9 (2.7x) | 24.4 | 15.2 (1.6x) |
| IVF PQ | IVF PQ | 168.2 | 72.7 (2.3x) | 42.0 | 9.0 (4.7x) |
| HNSW (CPU) | CAGRA | 3322.1 | 518.5 (6.4x) | 1106.1 | 89.7 (12.3x) |
Table 1: Index build times for Faiss-classic and Faiss-cuVS in seconds (with NVIDIA cuVS speedups in parentheses).
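The speedups in parentheses are the baseline time divided by the cuVS time. The short check below recomputes them from the Table 1 values; the dictionary layout and the `speedup` helper are just one convenient way to organize the numbers.

```python
# Build times in seconds from Table 1: (Faiss Classic baseline, Faiss cuVS).
build_seconds = {
    ("IVF Flat", "100M x 96"): (101.4, 37.9),
    ("IVF Flat", "5M x 1536"): (24.4, 15.2),
    ("IVF PQ", "100M x 96"): (168.2, 72.7),
    ("IVF PQ", "5M x 1536"): (42.0, 9.0),
    ("HNSW (CPU) vs CAGRA", "100M x 96"): (3322.1, 518.5),
    ("HNSW (CPU) vs CAGRA", "5M x 1536"): (1106.1, 89.7),
}


def speedup(baseline, accelerated):
    """How many times faster the accelerated run is than the baseline."""
    return baseline / accelerated


for (index_name, dataset), (classic, cuvs) in build_seconds.items():
    print(f"{index_name:20s} {dataset:10s} {speedup(classic, cuvs):.1f}x")
```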
Search latency (95% recall@10)
| Index (Faiss Classic) | Index (Faiss cuVS) | 100M x 96, Faiss Classic (milliseconds) | 100M x 96, Faiss cuVS (milliseconds) | 5M x 1536, Faiss Classic (milliseconds) | 5M x 1536, Faiss cuVS (milliseconds) |
| --- | --- | --- | --- | --- | --- |
| IVF Flat | IVF Flat | 0.75 | 0.39 (1.9x) | 1.98 | 1.14 (1.7x) |
| IVF PQ | IVF PQ | 0.49 | 0.17 (2.9x) | 1.78 | 0.22 (8.1x) |
| HNSW (CPU) | CAGRA | 0.56 | 0.23 (2.4x) | 0.71 | 0.15 (4.7x) |
Table 2: Online (i.e., one at a time) search query latency for Faiss-classic and Faiss-cuVS in milliseconds (with NVIDIA cuVS speedups in parentheses).
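Because these are one-at-a-time latencies, a back-of-the-envelope way to read them is as an upper bound on single-stream throughput: 1000 divided by the latency in milliseconds. The sketch below applies that conversion to the 100M x 96 column of Table 2. It ignores batching, network, and host overheads, so treat the figures as rough illustrations rather than measured throughput.

```python
# Online search latency in milliseconds from Table 2, 100M x 96 dataset:
# (Faiss Classic baseline, Faiss cuVS).
latency_ms = {
    "IVF Flat": (0.75, 0.39),
    "IVF PQ": (0.49, 0.17),
    "HNSW (CPU) vs CAGRA": (0.56, 0.23),
}


def queries_per_second(latency_in_ms):
    """Single-stream throughput bound implied by a per-query latency."""
    return 1000.0 / latency_in_ms


for index_name, (classic, cuvs) in latency_ms.items():
    print(
        f"{index_name:20s} "
        f"classic ~{queries_per_second(classic):.0f} qps, "
        f"cuVS ~{queries_per_second(cuvs):.0f} qps"
    )
```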
Looking forward
The emergence of state-of-the-art NVIDIA GPUs has revolutionized the field of vector search, enabling high recall and lightning-fast search speeds. The integration of Faiss and cuVS will continue to incorporate state-of-the-art algorithms, and we look forward to unlocking new innovations in this partnership between Meta and NVIDIA.
Read more about NVIDIA cuVS.