How to Enable CUDA GPU Acceleration for face_recognition
face_recognition is a powerful and easy-to-use Python library for facial recognition, supporting tasks like face detection, feature encoding, and comparison. However, when installed via pip or conda, it defaults to CPU-only mode.
Since the core computation of face_recognition relies on dlib, you must manually compile a version of dlib with CUDA and cuDNN support to unlock GPU acceleration. Here is the step-by-step guide.
Core: Compiling dlib with CUDA
To enable CUDA support, follow these steps:
1. Prepare the Environment
Create a new Conda environment:
$ conda create -n dlib python=3.8 cmake ipython
$ conda activate dlib
2. Install CUDA and cuDNN
Install the necessary CUDA toolkits using the NVIDIA channel:
$ conda install cuda cudnn -c nvidia
Note where nvcc lives in your current environment; you will need the directory that contains it for the build process:
$ which nvcc
/path/to/your/miniconda3/envs/dlib/bin/nvcc
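If you would rather look this directory up from a script than copy it by hand, a minimal sketch follows. It assumes nvcc is on PATH inside the activated environment; toolkit_bin_dir is a hypothetical helper name used only for illustration, not part of any library.

```python
import shutil
from pathlib import Path
from typing import Optional


def toolkit_bin_dir(nvcc_path: Optional[str]) -> Optional[str]:
    """Return the directory that contains nvcc, or None if nvcc was not found."""
    if not nvcc_path:
        return None
    return str(Path(nvcc_path).parent)


# shutil.which mirrors the shell's `which`: it returns the full path to the
# executable, or None when nvcc is not on PATH.
nvcc = shutil.which("nvcc")
print("nvcc:", nvcc)
print("directory used for -DCUDAToolkit_ROOT in this guide:", toolkit_bin_dir(nvcc))
```

If the first line prints None, the CUDA packages are probably not installed in the active environment, or the environment is not activated.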
3. Build dlib from Source
Clone and build dlib with CUDA support:
$ git clone https://github.com/davisking/dlib.git
$ cd dlib
$ mkdir build
$ cd build
$ cmake .. -DDLIB_USE_CUDA=1 -DUSE_AVX_INSTRUCTIONS=1 -DCUDAToolkit_ROOT=/path/to/your/miniconda3/envs/dlib/bin/
$ cmake --build .
$ cd ..
$ python setup.py install --set DLIB_USE_CUDA=1
4. Verify the Installation
Check if CUDA is enabled within Python:
(dlib) $ ipython
In [1]: import dlib
In [2]: dlib.DLIB_USE_CUDA
Out[2]: True
In [3]: print(dlib.cuda.get_num_devices())
1
If dlib.DLIB_USE_CUDA is True and at least one device is reported, dlib (and consequently face_recognition) can now use your GPU. I have verified this workflow within a Docker container (Ubuntu 20.04, CUDA 11.3, cuDNN 8.2.1, Python 3.8).
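The same check can be scripted, for example as a start-up test in a container. The sketch below is a minimal example under stated assumptions: cuda_status and choose_detector_model are hypothetical helper names, and the dlib module is passed in as an argument so the logic can be exercised on a machine without a GPU. It also picks a detector model, because face_recognition.face_locations uses the CPU-oriented "hog" detector by default, while its "cnn" model is the one documented as CUDA-accelerated.

```python
from types import SimpleNamespace


def cuda_status(dlib_module):
    """Summarise whether a dlib build can use CUDA.

    Reads the same two values checked interactively above:
    dlib.DLIB_USE_CUDA and dlib.cuda.get_num_devices().
    """
    compiled = bool(getattr(dlib_module, "DLIB_USE_CUDA", False))
    # Only query the device count when the build has CUDA support.
    devices = dlib_module.cuda.get_num_devices() if compiled else 0
    return {
        "compiled_with_cuda": compiled,
        "devices": devices,
        "usable": compiled and devices > 0,
    }


def choose_detector_model(status):
    """Pick the `model` argument for face_recognition.face_locations."""
    return "cnn" if status["usable"] else "hog"


if __name__ == "__main__":
    try:
        import dlib
    except ImportError:
        # Stand-in so the sketch still runs where dlib is not installed.
        dlib = SimpleNamespace(DLIB_USE_CUDA=False)
    status = cuda_status(dlib)
    print(status, "->", choose_detector_model(status))
```

With a CUDA build, the chosen value would then be passed as face_recognition.face_locations(image, model="cnn").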
Discussion
Simplicity
The strength of face_recognition lies in its minimal API. Face detection can be achieved in just a few lines:
import face_recognition
image = face_recognition.load_image_file("your_file.jpg")
face_locations = face_recognition.face_locations(image)
Comparing two faces is equally straightforward:
import face_recognition
picture_of_me = face_recognition.load_image_file("me.jpg")
my_face_encoding = face_recognition.face_encodings(picture_of_me)[0]
unknown_picture = face_recognition.load_image_file("unknown.jpg")
unknown_face_encoding = face_recognition.face_encodings(unknown_picture)[0]
results = face_recognition.compare_faces([my_face_encoding], unknown_face_encoding)
if results[0]:
    print("It's a picture of me!")
else:
    print("It's not a picture of me!")
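For intuition about what the comparison does: compare_faces measures the Euclidean distance between the 128-dimensional encodings and reports a match when that distance is within a tolerance, which defaults to 0.6. The sketch below re-creates that rule in plain Python. is_match is a hypothetical helper, and the 3-element vectors are made-up stand-ins for real encodings.

```python
import math


def is_match(known_encoding, candidate_encoding, tolerance=0.6):
    """Return True when two encodings are within `tolerance` of each other.

    Mirrors the rule used by face_recognition.compare_faces: Euclidean
    distance between the vectors, compared against a tolerance (default 0.6).
    """
    distance = math.dist(known_encoding, candidate_encoding)
    return distance <= tolerance


# Toy 3-element vectors; real encodings have 128 elements.
me = [0.10, 0.20, 0.30]
close = [0.15, 0.25, 0.35]  # distance is about 0.087
far = [0.90, 0.80, 0.70]    # distance is about 1.08

print(is_match(me, close))  # True
print(is_match(me, far))    # False
```

Passing a smaller tolerance to compare_faces makes matching stricter; a larger one makes it more permissive.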
Accuracy
face_recognition is often cited as “the world’s simplest face recognition library,” and its dlib-based model achieves 99.38% accuracy on the LFW dataset. That is respectable, but it may not be sufficient for high-stakes applications. If higher precision is required, consider insightface, which frequently scores above 99.7% on LFW.
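To put the gap between those two figures in perspective, the rough arithmetic below converts each accuracy into an approximate number of misclassified pairs. It assumes the standard LFW verification protocol of 6,000 face pairs (10 folds of 600); expected_errors is a hypothetical helper, and the result is a back-of-the-envelope estimate rather than a benchmark.

```python
LFW_PAIRS = 6000  # assumption: standard LFW verification protocol, 10 folds of 600 pairs


def expected_errors(accuracy_percent, pairs=LFW_PAIRS):
    """Approximate number of misclassified pairs at a given accuracy."""
    return round(pairs * (100 - accuracy_percent) / 100)


print(expected_errors(99.38))  # 37, from the dlib model figure quoted above
print(expected_errors(99.7))   # 18, from the insightface figure quoted above
```

Under these assumptions, a difference of about 0.3 percentage points corresponds to roughly half as many errors.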
Summary
face_recognition is excellent for rapid prototyping, but in its default CPU mode it can suffer from high latency when handling multiple faces or real-time video streams. By recompiling dlib with CUDA support, you can offload the heavy computation to your GPU and significantly improve performance.
I recommend using Linux for these deployments. If you are concerned about polluting your local environment, Docker is a perfect choice, as it supports GPU passthrough and allows for easy integration with cameras for real-time testing.