What does this repo signal mean?

NVIDIA published NVIDIA/k8s-device-plugin (Go). This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo NVIDIA/k8s-device-plugin · language Go. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

NVIDIA Repo: NVIDIA/k8s-device-plugin

Captured source

GitHub/github.com/NVIDIA/k8s-device-plugin

NVIDIA/k8s-device-plugin repository metadata

published Oct 10, 2017 · seen 3d · captured 13h · http 200 · method plain

NVIDIA/k8s-device-plugin

Description: NVIDIA device plugin for Kubernetes

Language: Go

License: Apache-2.0

Stars: 3785

Forks: 830

Open issues: 73

Created: 2017-10-10T21:31:02Z

Pushed: 2026-06-10T18:17:49Z

Default branch: main

Fork: no

Archived: no

README:

NVIDIA device plugin for Kubernetes

[About](#about)
[Prerequisites](#prerequisites)
[Quick Start](#quick-start)
[Preparing your GPU Nodes](#preparing-your-gpu-nodes)
[Example for debian-based systems with docker and containerd](#example-for-debian-based-systems-with-docker-and-containerd)
[Install the NVIDIA Container Toolkit](#install-the-nvidia-container-toolkit)
[Notes on CRI-O configuration](#notes-on-cri-o-configuration)
[Enabling GPU Support in Kubernetes](#enabling-gpu-support-in-kubernetes)
[Running GPU Jobs](#running-gpu-jobs)
[Configuring the NVIDIA device plugin binary](#configuring-the-nvidia-device-plugin-binary)
[As command line flags or envvars](#as-command-line-flags-or-envvars)
[As a configuration file](#as-a-configuration-file)
[Configuration Option Details](#configuration-option-details)
[Shared Access to GPUs](#shared-access-to-gpus)
[With CUDA Time-Slicing](#with-cuda-time-slicing)
[With CUDA MPS](#with-cuda-mps)
[IMEX Support](#imex-support)
[Catalog of Labels](#catalog-of-labels)
[Deployment via helm](#deployment-via-helm)
[Configuring the device plugin's helm chart](#configuring-the-device-plugins-helm-chart)
[Passing configuration to the plugin via a ConfigMap](#passing-configuration-to-the-plugin-via-a-configmap)
[Single Config File Example](#single-config-file-example)
[Multiple Config File Example](#multiple-config-file-example)
[Updating Per-Node Configuration With a Node Label](#updating-per-node-configuration-with-a-node-label)
[Setting other helm chart values](#setting-other-helm-chart-values)
[Deploying with gpu-feature-discovery for automatic node labels](#deploying-with-gpu-feature-discovery-for-automatic-node-labels)
[Deploying gpu-feature-discovery in standalone mode](#deploying-gpu-feature-discovery-in-standalone-mode)
[Deploying via helm install with a direct URL to the helm package](#deploying-via-helm-install-with-a-direct-url-to-the-helm-package)
[Building and Running Locally](#building-and-running-locally)
Advanced Topics
[Using CDI](#docs/cdi/md)
[With Docker](#with-docker)
[Build](#build)
[Run](#run)
[Without Docker](#without-docker)
[Build](#build-1)
[Run](#run-1)

[Changelog](#changelog)
[Issues and Contributing](#issues-and-contributing)
[Versioning](#versioning)
[Upgrading Kubernetes with the Device Plugin](#upgrading-kubernetes-with-the-device-plugin)

About

The NVIDIA device plugin for Kubernetes is a DaemonSet that allows you to automatically:

Expose the number of GPUs on each node of your cluster
Keep track of the health of your GPUs
Run GPU-enabled containers in your Kubernetes cluster

This repository contains NVIDIA's official implementation of the Kubernetes device plugin. As of v0.15.0, this repository also holds the implementation for GPU Feature Discovery labels. For further information on GPU Feature Discovery, see [here](docs/gpu-feature-discovery/README.md).

Please note that:

The NVIDIA device plugin API is beta as of Kubernetes v1.10.
The NVIDIA device plugin is currently lacking:
  Comprehensive GPU health checking features
  GPU cleanup features
Support will only be provided for the official NVIDIA device plugin (and not for forks or other variants of this plugin).

Prerequisites

The list of prerequisites for running the NVIDIA device plugin is described below:

NVIDIA drivers ~= 384.81
nvidia-docker >= 2.0 || nvidia-container-toolkit >= 1.7.0 (>= 1.11.0 to use integrated GPUs on Tegra-based systems)
nvidia-container-runtime configured as the default low-level runtime
Kubernetes version >= 1.10
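
The minimum versions above are dotted version numbers, so they need a numeric comparison rather than a string comparison (as strings, "1.9.0" sorts after "1.11.0"). The following is a minimal sketch of such a check, not part of the plugin itself. The function names and the sample "observed" versions are hypothetical, and it assumes purely numeric versions: a suffix such as a distribution tag would be reported as an error. The driver requirement (`~= 384.81`) is left out because the excerpt does not define that operator.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a dotted version such as "1.11.0" into integer parts.
// A leading "v" is tolerated; any non-numeric part is reported as an error.
func parseVersion(v string) ([]int, error) {
	fields := strings.Split(strings.TrimPrefix(v, "v"), ".")
	parts := make([]int, len(fields))
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil {
			return nil, fmt.Errorf("invalid version %q: %w", v, err)
		}
		parts[i] = n
	}
	return parts, nil
}

// atLeast reports whether have >= want, comparing part by part and
// treating missing trailing parts as zero (so "1.10" equals "1.10.0").
func atLeast(have, want string) (bool, error) {
	h, err := parseVersion(have)
	if err != nil {
		return false, err
	}
	w, err := parseVersion(want)
	if err != nil {
		return false, err
	}
	for i := 0; i < len(h) || i < len(w); i++ {
		var a, b int
		if i < len(h) {
			a = h[i]
		}
		if i < len(w) {
			b = w[i]
		}
		if a != b {
			return a > b, nil
		}
	}
	return true, nil
}

func main() {
	// Hypothetical versions observed on a node; the minimums come from the list above.
	checks := []struct{ name, have, want string }{
		{"Kubernetes", "1.29.3", "1.10"},
		{"nvidia-container-toolkit", "1.9.0", "1.7.0"},
		{"nvidia-container-toolkit (Tegra)", "1.9.0", "1.11.0"},
	}
	for _, c := range checks {
		ok, err := atLeast(c.have, c.want)
		if err != nil {
			fmt.Println(c.name, "error:", err)
			continue
		}
		fmt.Printf("%s %s >= %s: %t\n", c.name, c.have, c.want, ok)
	}
}
```

With the sample values, the same toolkit version satisfies the general minimum but not the higher minimum listed for integrated GPUs on Tegra-based systems.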

Quick Start

Preparing your GPU Nodes

The following steps need to be executed on all your GPU nodes. This README assumes that the NVIDIA drivers and the nvidia-container-toolkit have been pre-installed. It also assumes that you have configured the nvidia-container-runtime as the default low-level runtime to use.

Please see: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

Example for debian-based systems with `docker` and `containerd`

Install the NVIDIA Container Toolkit

For instructions on installing and getting started with the NVIDIA Container Toolkit, refer to the installation guide.

Also note the configuration instructions for your container runtime, and remember to restart each runtime after applying the configuration changes.

If the nvidia runtime should be set as the default runtime (with non-cri docker versions, for example), the --set-as-default argument must also be included in the commands above. If this is not done, a RuntimeClass needs to be defined:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
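
When the nvidia runtime is not the default, a pod has to select the RuntimeClass above by name through `spec.runtimeClassName`. The sketch below builds such a pod manifest as JSON, which `kubectl apply -f -` accepts. It is an illustration under stated assumptions: the struct types are a hand-written subset of the Pod schema rather than the official `k8s.io/api` types, the pod name and image are placeholders, and the `nvidia.com/gpu` resource name comes from the "Running GPU Jobs" part of the full README rather than from this excerpt.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Minimal hand-written subset of the Pod schema, for illustration only.
type pod struct {
	APIVersion string   `json:"apiVersion"`
	Kind       string   `json:"kind"`
	Metadata   metadata `json:"metadata"`
	Spec       podSpec  `json:"spec"`
}

type metadata struct {
	Name string `json:"name"`
}

type podSpec struct {
	RuntimeClassName string      `json:"runtimeClassName,omitempty"`
	RestartPolicy    string      `json:"restartPolicy,omitempty"`
	Containers       []container `json:"containers"`
}

type container struct {
	Name      string    `json:"name"`
	Image     string    `json:"image"`
	Resources resources `json:"resources"`
}

type resources struct {
	Limits map[string]string `json:"limits,omitempty"`
}

// gpuPod returns a single-container pod that selects the given RuntimeClass
// and requests the given number of GPUs as a resource limit.
func gpuPod(name, image, runtimeClass string, gpus int) pod {
	return pod{
		APIVersion: "v1",
		Kind:       "Pod",
		Metadata:   metadata{Name: name},
		Spec: podSpec{
			RuntimeClassName: runtimeClass,
			RestartPolicy:    "Never",
			Containers: []container{{
				Name:  name,
				Image: image,
				Resources: resources{
					Limits: map[string]string{"nvidia.com/gpu": strconv.Itoa(gpus)},
				},
			}},
		},
	}
}

func main() {
	// "gpu-test" and the image reference are placeholders, not real artifacts.
	p := gpuPod("gpu-test", "example.com/cuda-sample:latest", "nvidia", 1)
	out, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Passing an empty string as the runtime class omits the field entirely, which matches the case where the nvidia runtime is already the node's default.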

Notes on CRI-O configuration

When running Kubernetes with CRI-O, add a config file at /etc/crio/crio.conf.d/99-nvidia.conf that sets the nvidia-container-runtime as the default low-level OCI runtime. This file takes priority over the default crun config file at /etc/crio/crio.conf.d/10-crun.conf:

[crio]

[crio.runtime]
default_runtime = "nvidia"

[crio.runtime.runtimes]

[crio.runtime.runtimes.nvidia]
runtime_path = "/usr/bin/nvidia-container-runtime"
runtime_type = "oci"

As stated in the linked documentation, this file can be generated automatically with the nvidia-ctk command:

sudo nvidia-ctk runtime configure --runtime=crio --set-as-default --config=/etc/crio/crio.conf.d/99-nvidia.conf
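
The priority of 99-nvidia.conf over 10-crun.conf noted above can be pictured with a small model. This sketch assumes CRI-O applies drop-in files in lexical filename order, with later files overriding earlier ones, and that 10-crun.conf sets `default_runtime = "crun"`. Both are assumptions for illustration: the types and function are hypothetical, and the code neither parses TOML nor reads the real directory.

```go
package main

import (
	"fmt"
	"sort"
)

// dropIn models one file under /etc/crio/crio.conf.d and the
// default_runtime it sets ("" means the file does not set the key).
type dropIn struct {
	Name           string
	DefaultRuntime string
}

// effectiveDefaultRuntime applies drop-ins in lexical filename order and
// lets each later file that sets default_runtime override earlier ones.
func effectiveDefaultRuntime(files []dropIn) string {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]dropIn(nil), files...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	runtime := ""
	for _, f := range sorted {
		if f.DefaultRuntime != "" {
			runtime = f.DefaultRuntime
		}
	}
	return runtime
}

func main() {
	files := []dropIn{
		{Name: "99-nvidia.conf", DefaultRuntime: "nvidia"},
		{Name: "10-crun.conf", DefaultRuntime: "crun"}, // assumed content
	}
	fmt.Println(effectiveDefaultRuntime(files)) // prints "nvidia"
}
```

Under that ordering assumption, the `99-` prefix is what makes the nvidia setting win regardless of the order in which the files were created.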

CRI-O uses crun as default low-level OCI runtime so crun…

Excerpt shown — open the source for the full document.