A Flexible FPGA-based Inference Architecture for Pruned Deep Neural Networks

Thorbjörn Posewsky, Daniel Ziener

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

2 Citations (Scopus)
5 Downloads (Pure)

Abstract

In this paper, we present an architecture for embedded FPGA-based deep neural network inference that is able to handle pruned weight matrices. Pruning weights and even entire neurons significantly reduces the amount of data and the number of calculations, thereby greatly improving the efficiency and performance of neural network inference on embedded devices. By using an HLS approach, the architecture is easily extendable and highly configurable, with a free choice of parameters such as the number of MAC units or the activation function. For large neural networks, our approach achieves performance at least comparable to state-of-the-art x86-based software implementations while using only 10% of the energy.
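
To illustrate the core idea behind pruned inference, the following is a minimal C sketch, not the authors' HLS code: a fully-connected layer whose pruned weight matrix is stored in compressed sparse row (CSR) form, with a configurable number of parallel MAC units modeled as strided partial sums. The names csr_matrix, fc_layer_pruned, NUM_MAC, and relu are illustrative assumptions. Because only the surviving (non-pruned) weights are stored and multiplied, the MAC count scales with the number of nonzeros rather than the full matrix size, which is the source of the efficiency gain described above.

#include <stddef.h>

#define NUM_MAC 4  /* hypothetical: number of parallel MAC units */

/* Pruned weight matrix in CSR form: only non-pruned weights are stored. */
typedef struct {
    size_t rows;          /* number of output neurons */
    const size_t *rowptr; /* rows + 1 entries; nonzeros of row r are
                             [rowptr[r], rowptr[r+1]) */
    const size_t *col;    /* input index of each surviving weight */
    const float *val;     /* surviving weight values */
} csr_matrix;

static float relu(float x) { return x > 0.0f ? x : 0.0f; }

/* y = relu(W * x): only the stored nonzeros are multiplied, so pruning
 * directly reduces both memory traffic and MAC operations. */
void fc_layer_pruned(const csr_matrix *w, const float *x, float *y)
{
    for (size_t r = 0; r < w->rows; ++r) {
        float acc[NUM_MAC] = {0.0f};
        size_t begin = w->rowptr[r];
        /* An HLS tool would unroll/pipeline this loop across the MAC
         * units; the strided partial sums model that parallelism. */
        for (size_t i = begin; i < w->rowptr[r + 1]; ++i)
            acc[(i - begin) % NUM_MAC] += w->val[i] * x[w->col[i]];
        float sum = 0.0f;
        for (int m = 0; m < NUM_MAC; ++m)
            sum += acc[m];
        y[r] = relu(sum);
    }
}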
Original language: English
Title of host publication: Architecture of Computing Systems – ARCS 2018
Subtitle of host publication: 31st International Conference, Braunschweig, Germany, April 9–12, 2018, Proceedings
Editors: Mladen Berekovic, Rainer Buchty, Heiko Hamann, Dirk Koch, Thilo Pionteck
Place of Publication: Braunschweig, Germany
Publisher: Springer
Pages: 311-323
Number of pages: 13
ISBN (Electronic): 978-3-319-77610-1
ISBN (Print): 978-3-319-77609-5
DOI: https://doi.org/10.1007/978-3-319-77610-1_23
Publication status: Published - 1 May 2018
Externally published: Yes
Event: 31st International Conference on Architecture of Computing Systems 2018 - Technical University of Braunschweig, Braunschweig, Germany
Duration: 9 Apr 2018 – 12 Apr 2018
Conference number: 31
http://arcs2018.itec.kit.edu/

Publication series

Name: Lecture notes in computer science
Volume: 10793

Conference

Conference: 31st International Conference on Architecture of Computing Systems 2018
Abbreviated title: ARCS 2018
Country: Germany
City: Braunschweig
Period: 9/04/18 – 12/04/18
Internet address: http://arcs2018.itec.kit.edu/


Cite this

Posewsky, T., & Ziener, D. (2018). A Flexible FPGA-based Inference Architecture for Pruned Deep Neural Networks. In M. Berekovic, R. Buchty, H. Hamann, D. Koch, & T. Pionteck (Eds.), Architecture of Computing Systems – ARCS 2018: 31st International Conference, Braunschweig, Germany, April 9–12, 2018, Proceedings (pp. 311-323). (Lecture notes in computer science; Vol. 10793). Braunschweig, Germany: Springer. https://doi.org/10.1007/978-3-319-77610-1_23