Deniz Kruppe
Optimizing the Energy Efficiency of YOLO-based Object Detection on FPGAs
Abstract
YOLO (You Only Look Once) is a family of state-of-the-art object detection systems that
reach frame rates above the real-time threshold (> 30 FPS). Over the past decade, there has been
increasing interest in using FPGAs (Field Programmable Gate Arrays) for deep learning
applications. For sufficiently complex algorithms such as object detection, FPGAs
are typically much faster than CPUs, consume less power than GPUs, and offer
shorter development times and lower development costs than ASICs
(Application-Specific Integrated Circuits) when flexible reconfiguration is desired. In this thesis,
several implementations of the Xilinx DPU (Deep Learning Processor Unit), a hardware
accelerator for deep learning, were created with the Vitis AI toolchain to explore the
design space across different parameters. Additionally, YOLOv4 was trained
on a custom image dataset and deployed to each of these hardware implementations.
The power consumption, hardware resource usage, detection accuracy and processing
speed of the resulting implementations were benchmarked.