Docker containers don’t see your system’s GPU automatically. This causes reduced performance in GPU-dependent workloads such as machine learning frameworks. Here’s how to expose your host’s NVIDIA GPU to your containers.

Making GPUs Work In Docker

Docker containers share your host’s kernel but bring along their own operating system and software packages. This means they lack the NVIDIA drivers used to interface with your GPU. Docker doesn’t even add GPUs to containers by default so a plain docker run won’t see your hardware at all.

At a high level, getting your GPU to work is a two-step procedure: install the drivers within your image, then instruct Docker to add GPU devices to your containers at runtime.

This guide focuses on modern versions of CUDA and Docker. The latest release of NVIDIA Container Toolkit is designed for combinations of CUDA 10 and Docker Engine 19.03 and later. Older builds of CUDA, Docker, and the NVIDIA drivers may require additional steps.

Adding the NVIDIA Drivers

Make sure you’ve got the NVIDIA drivers working properly on your host before you continue with your Docker configuration. You should be able to successfully run nvidia-smi and see your GPU’s name, driver version, and CUDA version.
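A quick way to confirm the host side is ready is to wrap that check in a small function. This is only a sketch: check_host_gpu is a hypothetical helper name, and it assumes nvidia-smi is installed with the driver and supports the --query-gpu option.

```shell
# Print one CSV line per GPU (model name, driver version),
# or fail with a hint when the driver tools are missing.
check_host_gpu() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: install the NVIDIA driver on the host first" >&2
    return 1
  fi
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

# Usage:
#   check_host_gpu && echo "Host driver looks healthy"
```

If the function reports that nvidia-smi is missing, fix the host driver installation before touching Docker.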

To use your GPU with Docker, begin by adding the NVIDIA Container Toolkit to your host. This integrates into Docker Engine to automatically configure your containers for GPU support.

Add the toolkit’s package repository to your system first.

Next, install the nvidia-docker2 package on your host.

Finally, restart the Docker daemon to complete the installation.
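The three steps above can be sketched as one function for a Debian or Ubuntu host. Treat it as an assumption-laden example rather than NVIDIA’s current procedure: install_nvidia_docker2 is a hypothetical name, the repository URLs are the ones historically published for nvidia-docker, and apt-key is deprecated on newer releases. Check NVIDIA’s installation guide before running anything like this.

```shell
# Assumes: Debian/Ubuntu, curl, sudo, apt and systemd are available.
install_nvidia_docker2() {
  # Build an identifier such as "ubuntu20.04" from /etc/os-release
  distribution=$(. /etc/os-release; echo "$ID$VERSION_ID")

  # 1. Trust NVIDIA's signing key and register the package list
  curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
    && curl -s -L "https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list" \
       | sudo tee /etc/apt/sources.list.d/nvidia-docker.list \
    && sudo apt-get update \
    && sudo apt-get install -y nvidia-docker2 \
    && sudo systemctl restart docker
  # 2. apt-get installs the package; 3. systemctl restarts the daemon
}

# Usage (modifies the system, so run it deliberately):
#   install_nvidia_docker2
```

Each step is chained with &&, so a failed download or install stops the sequence before the daemon is restarted.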

The Container Toolkit should now be operational. You’re ready to start a test container.

Starting a Container With GPU Access

As Docker doesn’t provide your system’s GPUs by default, you need to create containers with the --gpus flag for your hardware to show up. You can either name specific devices to enable or use the all keyword.

The nvidia/cuda images are preconfigured with the CUDA binaries and GPU tools. Start a container and run the nvidia-smi command to check that your GPU is accessible. The output should match what you saw when using nvidia-smi on your host, although the CUDA version could differ depending on the toolkit versions on your host and in your selected container image.
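A smoke test along these lines might look like the sketch below. gpu_smoke_test is a hypothetical helper, and the image tag is the one used as an example in this article; older tags are sometimes withdrawn from Docker Hub, so substitute a current one if the pull fails.

```shell
# Run nvidia-smi inside a CUDA base image.
# $1 is the value handed to --gpus (defaults to "all").
gpu_smoke_test() {
  gpus="${1:-all}"
  docker run --rm --gpus "$gpus" nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
}

# Usage:
#   gpu_smoke_test                  # expose every GPU
#   gpu_smoke_test device=0         # expose only the first GPU
#   gpu_smoke_test '"device=0,1"'   # several GPUs: the inner quotes keep the comma intact
```

The --rm flag simply discards the test container once nvidia-smi exits.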

Selecting a Base Image

Using one of the nvidia/cuda tags is the quickest and easiest way to get your GPU workload running in Docker. Many different variants are available; they provide a matrix of operating system, CUDA version, and NVIDIA software options. The images are built for multiple architectures.

Each tag has this format:

11.4.0-base-ubuntu20.04

11.4.0 – CUDA version.
base – Image flavor.
ubuntu20.04 – Operating system version.
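If you script against several tags, the three-part layout can be pulled apart with plain shell parameter expansion. This is a sketch: parse_cuda_tag is a hypothetical helper, and it assumes the CUDA version always comes first and the operating system last.

```shell
# Split an nvidia/cuda tag into its CUDA version, flavor and OS parts.
parse_cuda_tag() {
  tag="$1"
  cuda="${tag%%-*}"            # everything before the first hyphen
  os="${tag##*-}"              # everything after the last hyphen
  flavor="${tag#"$cuda"-}"     # drop the leading "<cuda>-"
  flavor="${flavor%-"$os"}"    # drop the trailing "-<os>"
  printf 'cuda=%s flavor=%s os=%s\n' "$cuda" "$flavor" "$os"
}

parse_cuda_tag 11.4.0-base-ubuntu20.04
# prints: cuda=11.4.0 flavor=base os=ubuntu20.04
```

Taking the flavor as whatever sits between the first and last hyphen keeps the helper working when the middle part itself contains a hyphen.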

Three different image flavors are available. The base image is a minimal option with the essential CUDA runtime binaries. runtime is a more fully-featured option that includes the CUDA math libraries and NCCL for cross-GPU communication. The third variant is devel which gives you everything from runtime as well as headers and development tools for creating custom CUDA images.

If one of the images will work for you, aim to use it as the base in your Dockerfile. You can then use regular Dockerfile instructions to install your programming languages, copy in your source code, and configure your application. This removes the complexity of manual GPU setup steps.
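As a rough illustration, the snippet below writes such a Dockerfile from the shell. Everything specific in it is an assumption for the sake of the example: the gpu-app directory, the tensor-code.py script name, and the choice of TensorFlow. Your framework’s own CUDA and cuDNN requirements decide which image flavor and version you actually need, and the base flavor may not carry every library it expects.

```shell
mkdir -p gpu-app

# Write a Dockerfile that layers a Python workload on a CUDA base image.
cat > gpu-app/Dockerfile <<'EOF'
FROM nvidia/cuda:11.4.0-base-ubuntu20.04

# Ordinary Dockerfile instructions from here on: install a language runtime...
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# ...add your framework (TensorFlow is an assumption in this sketch)...
RUN pip3 install tensorflow

# ...and copy in your code (tensor-code.py is a placeholder name).
COPY tensor-code.py .
ENTRYPOINT ["python3", "tensor-code.py"]
EOF

# With tensor-code.py placed in gpu-app/, build and run with GPU access:
#   docker build -t gpu-app gpu-app
#   docker run --rm --gpus all gpu-app
```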

Building and running this image with the --gpus flag would start your TensorFlow workload with GPU acceleration.

Manually Configuring an Image

You can manually add CUDA support to your image if you need to choose a different base. The best way to achieve this is to reference the official NVIDIA Dockerfiles.

Copy the instructions used to add the CUDA package repository, install the library, and link it into your path. We’re not reproducing all the steps in this guide as they vary by CUDA version and operating system.

Pay attention to the environment variables at the end of the Dockerfile – these define how containers using your image integrate with the NVIDIA Container Runtime:
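To the best of our knowledge, the two variables that matter are NVIDIA_VISIBLE_DEVICES and NVIDIA_DRIVER_CAPABILITIES. The sketch below appends them to a Dockerfile of your own; the custom-cuda directory is a hypothetical location, and the values shown are the commonly used defaults rather than a requirement.

```shell
mkdir -p custom-cuda

# Append the NVIDIA runtime settings to the end of your own Dockerfile.
cat >> custom-cuda/Dockerfile <<'EOF'
# Which GPUs the NVIDIA runtime makes visible inside the container
ENV NVIDIA_VISIBLE_DEVICES=all
# Driver features to mount: CUDA compute plus utilities such as nvidia-smi
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
EOF
```

Because the heredoc appends with >>, the rest of your Dockerfile (base image, CUDA installation steps) should already be in place above these lines.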

Your image should detect your GPU once CUDA’s installed and the environment variables have been set. This gives you more control over the contents of your image but leaves you responsible for updating the instructions as new CUDA versions are released.

How Does It Work?

The NVIDIA Container Toolkit is a collection of packages which wrap container runtimes like Docker with an interface to the NVIDIA driver on the host. The libnvidia-container library provides an API and CLI that automatically expose your system’s GPUs to containers via the runtime wrapper.

The nvidia-container-toolkit component implements a container runtime prestart hook. This means it’s notified when a new container is about to start. It looks at the GPUs you want to attach and invokes libnvidia-container to handle container creation.

The hook is enabled by nvidia-container-runtime. This wraps your “real” container runtime such as containerd or runc to ensure the NVIDIA prestart hook is run. Your existing runtime continues the container start process after the hook has executed. When the container toolkit is installed, you’ll see the NVIDIA runtime selected in your Docker daemon config file.
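You can check for that registration yourself. The helper below is a sketch under two assumptions: the daemon config lives at /etc/docker/daemon.json (the usual Linux location), and the runtime is registered under the name nvidia. has_nvidia_runtime is a hypothetical function name.

```shell
# Succeed when the given Docker daemon config mentions an "nvidia" runtime.
# $1 is the config path (defaults to /etc/docker/daemon.json).
has_nvidia_runtime() {
  config="${1:-/etc/docker/daemon.json}"
  [ -r "$config" ] && grep -q '"nvidia"' "$config"
}

# Usage:
#   if has_nvidia_runtime; then
#     echo "NVIDIA runtime is registered with Docker"
#   else
#     echo "NVIDIA runtime not found in daemon.json"
#   fi
```

This is a plain text match rather than a JSON parse, which is enough for a quick sanity check after installation.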

Summary

Using an NVIDIA GPU inside a Docker container requires you to add the NVIDIA Container Toolkit to the host. This integrates the NVIDIA drivers with your container runtime.

Calling docker run with the --gpus flag makes your hardware visible to the container. This must be set on each container you launch, after the Container Toolkit has been installed.

NVIDIA provides preconfigured CUDA Docker images that you can use as a quick starter for your application. If you need something more specific, refer to the official Dockerfiles to assemble your own image that’s still compatible with the Container Toolkit.