Design Methodologies for FPGA-based Deep Learning Accelerators and Their Characterization
Created by
Sestito, Cristian
Fortino, Giancarlo
Perri, Stefania
Corsonello, Pasquale
Description
Università della Calabria
Dipartimento di Ingegneria Informatica, Modellistica, Elettronica e Sistemistica
PhD Programme (Dottorato di Ricerca) in Information and Communication Technologies, Cycle XXXV

Deep Neural Networks (DNNs) are widespread in many applications, including
computer vision, speech recognition and robotics, thanks to the ability of such models to
extract information by building a hierarchical representation of knowledge.
Image processing exploits this hierarchical representation through Convolutional Neural
Networks (CNNs), which consist of several Convolutional (CONV) layers that extract
features from inputs at different levels of abstraction. However, CNNs usually require
billions of computations to reach high accuracy levels. To sustain such a
computational load, proper hardware acceleration is needed.
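
To give a sense of scale for the "billions of computations" mentioned above, the following minimal sketch counts the multiply-accumulate operations (MACs) of a single CONV layer. The function name and the layer dimensions are illustrative assumptions and are not taken from the dissertation; the count uses the standard formula for a dense convolution and ignores bias and activation.

```python
def conv_layer_macs(h_out, w_out, c_in, c_out, k):
    """MACs of one dense k x k CONV layer.

    Each of the h_out * w_out * c_out output values needs
    k * k * c_in multiply-accumulate operations.
    """
    return h_out * w_out * c_out * k * k * c_in


# Hypothetical layer: 224x224 output map, 64 input and 64 output
# channels, 3x3 filters (sizes chosen only for illustration).
macs = conv_layer_macs(h_out=224, w_out=224, c_in=64, c_out=64, k=3)
ops = 2 * macs  # one multiplication plus one addition per MAC

print(macs)  # 1849688064
print(ops)   # 3699376128
```

Under these assumptions a single layer already costs about 1.85 billion MACs, so a network stacking many such layers quickly reaches tens of billions of operations per image.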
Field Programmable Gate Arrays (FPGAs) have proven to be promising
candidates, because they can achieve high throughput at limited power dissipation.
In addition, FPGAs are flexible enough to accommodate a variety of CNN workloads.
While the hardware acceleration of conventional CNN models has been widely
investigated, interest in more sophisticated tasks is still emerging. These tasks
include CNNs based on Dilated Convolutions (DCONVs) and Transposed Convolutions
(TCONVs), which dilate the filter and the input image, respectively. As a result,
these architectures exhibit higher computational complexity and require careful
hardware management.
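
The difference between filter dilation and image dilation can be sketched in one dimension. The snippet below is a simplified illustration and not the hardware architecture proposed in the dissertation: the helper names (`insert_zeros`, `dconv1d`, `tconv1d`) and the sample values are assumptions made for this example. A DCONV is modelled as a sliding dot product with a zero-stuffed filter, and a TCONV as a full convolution over a zero-stuffed input.

```python
def insert_zeros(seq, factor):
    """Insert (factor - 1) zeros between adjacent elements of seq."""
    out = []
    for i, value in enumerate(seq):
        out.append(value)
        if i < len(seq) - 1:
            out.extend([0] * (factor - 1))
    return out


def dconv1d(x, kernel, rate):
    """Dilated convolution: dilate the filter, then slide it over x
    (cross-correlation form, no padding)."""
    k = insert_zeros(kernel, rate)
    n_out = len(x) - len(k) + 1
    return [sum(x[i + j] * k[j] for j in range(len(k))) for i in range(n_out)]


def tconv1d(x, kernel, stride):
    """Transposed convolution: dilate the input, then apply a full
    convolution, so the output is longer than the input."""
    x_up = insert_zeros(x, stride)
    out = [0] * (len(x_up) + len(kernel) - 1)
    for m, sample in enumerate(x_up):
        for j, w in enumerate(kernel):
            out[m + j] += sample * w
    return out


print(insert_zeros([1, 0, -1], 2))            # [1, 0, 0, 0, -1]
print(dconv1d([1, 2, 3, 4, 5], [1, 0, -1], 2))  # [-4]
print(tconv1d([1, 2, 3], [1, 1], 2))          # [1, 1, 2, 2, 3, 3]
```

In this naive form many multiplications involve the inserted zeros, which hints at why a straightforward mapping of DCONVs and TCONVs to hardware wastes computation and needs careful management.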
This PhD dissertation deals with the FPGA acceleration of CNNs for Image
Processing based on DCONVs and TCONVs. Specifically, several designs using both the
Very High-Speed Integrated Circuits Hardware Description Language (VHDL) and
High-Level Synthesis (HLS) are presented. A detailed characterization is discussed, based
on the evaluation of resource occupation, throughput, and power dissipation, as well as the
impact of data quantization. Overall, the proposed circuits show noticeable energy efficiency
when compared to several state-of-the-art counterparts. For instance, hardware
acceleration of run-time reconfigurable CONVs and TCONVs for super-resolution
imaging has shown an energy efficiency of up to 518.5 GOPS/W, outperforming
state-of-the-art competitors by up to 2.3 times.

The PhD scholarship was co-financed with resources from the Programma Operativo Regionale Calabria
FSE/FESR 2014-2020 (CCI 2014IT16M2OP006).
Subject
FPGA
Relation
ING-INF/01