How to Enable CUDA GPU Acceleration for face_recognition

face_recognition is a powerful and easy-to-use Python library for facial recognition, supporting tasks like face detection, feature encoding, and comparison. However, when installed via pip or conda, it defaults to CPU-only mode.

Since the core computation of face_recognition relies on dlib, you must manually compile a version of dlib with CUDA and cuDNN support to unlock GPU acceleration. Here is the step-by-step guide.

Core: Compiling dlib with CUDA

To enable CUDA support, follow these steps:

1. Prepare the Environment

Create a new Conda environment:

$ conda create -n dlib python=3.8 cmake ipython
$ conda activate dlib

2. Install CUDA and cuDNN

Install the CUDA toolkit and cuDNN from the NVIDIA channel:

$ conda install cuda cudnn -c nvidia

Note where nvcc lives in your current environment; the build step needs the directory that contains it:

$ which nvcc
/path/to/your/miniconda3/envs/dlib/bin/nvcc
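
The same lookup can be done from Python, which may be convenient if you script the build. This is a minimal sketch using only the standard library; find_nvcc_dir is a hypothetical helper name, not part of dlib or face_recognition.

```python
import os
import shutil


def find_nvcc_dir():
    """Return the directory that contains nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")  # same PATH lookup that `which nvcc` performs
    if nvcc is None:
        return None
    return os.path.dirname(os.path.abspath(nvcc))


if __name__ == "__main__":
    nvcc_dir = find_nvcc_dir()
    if nvcc_dir is None:
        print("nvcc not found; is the conda environment activated?")
    else:
        print("nvcc directory:", nvcc_dir)
```

If the function returns None, the environment is most likely not activated or the CUDA packages did not install.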

3. Build dlib from Source

Clone and build dlib with CUDA support:

$ git clone https://github.com/davisking/dlib.git
$ cd dlib
$ mkdir build
$ cd build
$ cmake .. -DDLIB_USE_CUDA=1 -DUSE_AVX_INSTRUCTIONS=1 -DCUDAToolkit_ROOT=/path/to/your/miniconda3/envs/dlib/bin/
$ cmake --build .
$ cd ..
$ python setup.py install --set DLIB_USE_CUDA=1

4. Verify the Installation

Check if CUDA is enabled within Python:

(dlib) $ ipython
In [1]: import dlib
In [2]: dlib.DLIB_USE_CUDA
Out[2]: True
In [3]: print(dlib.cuda.get_num_devices())
1

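If dlib.DLIB_USE_CUDA is True and at least one device is reported, dlib was built with CUDA support, and the face_recognition calls that run dlib's neural networks (such as computing face encodings and the CNN face detector) can use your GPU. I have verified this workflow within a Docker container (Ubuntu 20.04, CUDA 11.3, cuDNN 8.2.1, Python 3.8).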
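
For a non-interactive check (for example in a Docker build or a startup script), the two calls from the session above can be wrapped in a small function. This is a sketch, not an official dlib utility: cuda_status is a hypothetical name, and it assumes that a missing dlib or a failing device query should simply be reported rather than raised.

```python
def cuda_status():
    """Summarize whether the installed dlib build can use CUDA."""
    try:
        import dlib
    except ImportError:
        return {"dlib_installed": False, "cuda_enabled": False, "devices": 0}

    cuda_enabled = bool(dlib.DLIB_USE_CUDA)
    devices = 0
    if cuda_enabled:
        # Only query the device count when the build has CUDA support.
        try:
            devices = int(dlib.cuda.get_num_devices())
        except Exception:
            # Assumption: treat a failing query (e.g. no driver) as zero devices.
            devices = 0
    return {"dlib_installed": True, "cuda_enabled": cuda_enabled, "devices": devices}


if __name__ == "__main__":
    print(cuda_status())
```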

Discussion

Simplicity

The strength of face_recognition lies in its minimal API. Face detection can be achieved in just a few lines:

import face_recognition
image = face_recognition.load_image_file("your_file.jpg")
face_locations = face_recognition.face_locations(image)
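
Note that face_locations uses the HOG detector by default, which runs on the CPU even with a CUDA-enabled dlib. To exercise the GPU during detection, pass model="cnn". The sketch below shows one way to switch between the two; pick_detector_model and detect_faces are hypothetical helper names introduced for illustration.

```python
def pick_detector_model(use_gpu):
    """Return the face_locations model name: 'cnn' (GPU-capable) or 'hog' (CPU)."""
    return "cnn" if use_gpu else "hog"


def detect_faces(image_path, use_gpu=True):
    """Return face bounding boxes as (top, right, bottom, left) tuples."""
    # Imported lazily so pick_detector_model stays usable without the library.
    import face_recognition

    image = face_recognition.load_image_file(image_path)
    return face_recognition.face_locations(image, model=pick_detector_model(use_gpu))
```

Calling detect_faces("your_file.jpg") then uses the CNN detector, while detect_faces("your_file.jpg", use_gpu=False) falls back to the default HOG path.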

Comparing two faces is equally straightforward:

import face_recognition

picture_of_me = face_recognition.load_image_file("me.jpg")
my_face_encoding = face_recognition.face_encodings(picture_of_me)[0]

unknown_picture = face_recognition.load_image_file("unknown.jpg")
unknown_face_encoding = face_recognition.face_encodings(unknown_picture)[0]

results = face_recognition.compare_faces([my_face_encoding], unknown_face_encoding)

if results[0]:
    print("It's a picture of me!")
else:
    print("It's not a picture of me!")
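
Under the hood, compare_faces checks whether the Euclidean distance between two encodings is within a tolerance (0.6 by default; lower values are stricter). The sketch below reproduces that decision rule with the standard library so the threshold is easier to reason about. It uses toy 3-dimensional vectors as stand-ins for real 128-dimensional encodings, and is_match is a hypothetical name, not a library function.

```python
import math

DEFAULT_TOLERANCE = 0.6  # same default that compare_faces documents


def is_match(known_encoding, candidate_encoding, tolerance=DEFAULT_TOLERANCE):
    """Return True when the Euclidean distance is within the tolerance."""
    return math.dist(known_encoding, candidate_encoding) <= tolerance


# Toy vectors standing in for real 128-dimensional face encodings.
me = [0.10, 0.20, 0.30]
close = [0.15, 0.25, 0.35]  # distance is about 0.087
far = [0.90, 0.90, 0.90]    # distance is about 1.22

print(is_match(me, close))  # True
print(is_match(me, far))    # False
```

In real code, also keep in mind that face_encodings returns an empty list when no face is found, so indexing with [0] as in the example above raises IndexError for such images.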

Accuracy

face_recognition bills itself as “the world’s simplest face recognition library,” and its dlib-based model achieves 99.38% accuracy on the LFW (Labeled Faces in the Wild) benchmark. That is respectable, but it may not be sufficient for high-stakes applications. If higher precision is required, consider insightface, which frequently scores above 99.7% on LFW.

Summary

face_recognition is excellent for rapid prototyping, but its default CPU mode can be too slow when handling multiple faces or real-time video streams. By recompiling dlib with CUDA support, you can offload the heavy computations to your GPU and significantly improve performance.
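
To see how much the rebuild helps on your own hardware, time the same call in the CPU-only and the CUDA-enabled environments. The helper below is a rough sketch using only the standard library; average_seconds is a hypothetical name, and the warm-up call is an assumption made so that one-time setup cost (such as CUDA initialization) does not skew the average.

```python
import time


def average_seconds(func, repeats=5):
    """Call func() `repeats` times and return the mean wall-clock seconds per call."""
    func()  # warm-up call, excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats
```

For example, with an image already loaded, average_seconds(lambda: face_recognition.face_locations(image, model="cnn")) gives a per-call figure you can compare across the two environments.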

I recommend using Linux for these deployments. If you are concerned about polluting your local environment, Docker is a perfect choice, as it supports GPU passthrough and allows for easy integration with cameras for real-time testing.
