Design Methodologies for FPGA-based Deep Learning Accelerators and Their Characterization

Sestito, Cristian; Fortino, Giancarlo; Perri, Stefania; Corsonello, Pasquale

Created by

Sestito, Cristian

Fortino, Giancarlo

Perri, Stefania

Corsonello, Pasquale

URI

https://hdl.handle.net/10955/5619
https://doi.org/10.13126/unical.it/dottorati/5619

Description

Format

Università della Calabria, Dipartimento di Ingegneria Informatica, Modellistica, Elettronica e Sistemistica. PhD Program in Information and Communication Technologies, Cycle XXXV.

Deep Neural Networks (DNNs) are widespread in many applications, including computer vision, speech recognition and robotics, thanks to their ability to extract information by building a hierarchical representation of knowledge. Image processing exploits this ability through Convolutional Neural Networks (CNNs), which consist of several Convolutional (CONV) layers that extract features from inputs at different levels of abstraction. However, CNNs usually require billions of computations to reach high accuracy levels. Sustaining such a computational load requires proper hardware acceleration. Field Programmable Gate Arrays (FPGAs) have proven to be promising candidates, because they can achieve high throughput at limited power dissipation. In addition, FPGAs are flexible architectures that can accommodate several CNN workloads.

While the hardware acceleration of conventional CNN models has been widely investigated, interest in more sophisticated tasks is still emerging. These include CNNs based on Dilated Convolutions (DCONVs) and Transposed Convolutions (TCONVs), which deal with filter and image dilations, respectively. Such architectures exhibit higher computational complexity and therefore require careful hardware management.

This PhD dissertation deals with the FPGA acceleration of CNNs for image processing based on DCONVs and TCONVs. Specifically, several designs using both the Very High-Speed Integrated Circuits Hardware Description Language (VHDL) and High-Level Synthesis (HLS) are presented. A detailed characterization is discussed, based on the evaluation of resource occupation, throughput and power dissipation, as well as the impact of data quantization.

Overall, the proposed circuits show noticeable energy efficiency when compared to several state-of-the-art counterparts. For instance, hardware acceleration of run-time reconfigurable CONVs and TCONVs for super-resolution imaging has shown an energy efficiency of up to 518.5 GOPS/W, outperforming state-of-the-art competitors by up to 2.3 times.

The doctoral scholarship was co-funded with resources from the Programma Operativo Regionale Calabria FSE/FESR 2014-2020 (CCI 2014IT16M2OP006).
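The abstract describes DCONVs and TCONVs as dealing with filter and image dilations, respectively. The following one-dimensional Python sketch illustrates that distinction under simplifying assumptions: it is not taken from the dissertation, the function names (`dilate`, `conv1d_valid`, `dilated_conv1d`, `transposed_conv1d`) are hypothetical, and padding, multiple channels and quantization are ignored. A dilated convolution is modelled by inserting zeros between filter taps, and a transposed convolution by inserting zeros between input samples before an ordinary sliding-window pass.

```python
def dilate(seq, rate):
    """Insert (rate - 1) zeros between consecutive elements of seq."""
    out = []
    for i, value in enumerate(seq):
        if i > 0:
            out.extend([0] * (rate - 1))
        out.append(value)
    return out


def conv1d_valid(signal, kernel):
    """Sliding-window correlation without padding (the CNN-style 'convolution')."""
    n_out = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n_out)
    ]


def dilated_conv1d(signal, kernel, rate):
    """DCONV sketch: the filter is dilated, the input is left untouched."""
    return conv1d_valid(signal, dilate(kernel, rate))


def transposed_conv1d(signal, kernel, stride):
    """TCONV sketch: the input is dilated (zero-inserted), then fully padded
    and swept with the flipped kernel, which upsamples the signal."""
    upsampled = dilate(signal, stride)
    pad = [0] * (len(kernel) - 1)
    return conv1d_valid(pad + upsampled + pad, kernel[::-1])


if __name__ == "__main__":
    print(dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 2, 3], rate=2))
    print(transposed_conv1d([1, 2], [1, 2, 3], stride=2))
```

In both cases most of the multiplications involve inserted zeros, which is one plausible reading of why the abstract says these layers need careful hardware management: a naive datapath spends cycles on operands that contribute nothing to the result.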

Subject

FPGA

Relation

ING-INF/01;