This is SPATEM! A Spatial-Temporal Optimization Framework for Efficient Inference on ReRAM-based CNN Accelerator

Yen-Ting Tsou, Kuan-Hsun Chen, Chia-Lin Yang, Hsiang-Yun Cheng, Jian-Jia Chen, Der-Yu Tsai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Resistive memory-based computing-in-memory (CIM) is considered a promising solution for accelerating convolutional neural network (CNN) inference. It stores the weights in crossbar memory arrays and performs in-situ matrix-vector multiplications (MVMs) in an analog manner. Several techniques assume that a whole crossbar can operate concurrently and discuss how to efficiently map the weights onto crossbar arrays. In practice, however, the accumulated effect of per-cell current deviation and Analog-to-Digital-Converter overhead may greatly degrade inference accuracy. This motivates the concept of the Operation Unit (OU), under which each operation per cycle in a crossbar involves only a limited number of wordlines and bitlines, so that satisfactory inference accuracy is preserved. With OU-based operations, the weight mapping and the scheduling strategy for parallelizing CNN convolution operations should take communication overhead and resource utilization into account to optimize inference acceleration. In this work, we propose SPATEM, the first optimization framework that efficiently executes MVMs with OU-based operations on ReRAM-based CIM accelerators. It decouples the design space into tractable steps, models the expected inference latency, and derives an optimized spatial-temporal-aware scheduling strategy. Compared with state-of-the-art approaches, the experimental results show that the scheduling strategy derived by SPATEM achieves on average a 29.24% reduction in inference latency with 31.28% less communication overhead by exploiting more of the originally unused crossbar cells.
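
To make the OU concept in the abstract concrete, the following is a minimal sketch of an MVM carried out in OU-sized pieces, where each cycle activates only a limited block of wordlines and bitlines and the partial sums are accumulated. It is an idealized digital model written for illustration, not the SPATEM implementation: the function name `ou_mvm`, the one-cycle-per-OU cost, and the example sizes are all assumptions, and analog effects such as current deviation and ADC behavior are not modeled.

```python
def ou_mvm(weights, inputs, ou_rows, ou_cols):
    """Compute one output per bitline while activating at most
    ou_rows wordlines and ou_cols bitlines per cycle.

    weights: list of rows (wordlines), each a list of cell values (bitlines).
    inputs:  one input value per wordline.
    Returns (outputs, cycles). Hypothetical helper; assumes one cycle per OU.
    """
    n_rows = len(weights)
    n_cols = len(weights[0])
    outputs = [0] * n_cols
    cycles = 0
    # Walk over the crossbar in OU-sized blocks.
    for r0 in range(0, n_rows, ou_rows):
        for c0 in range(0, n_cols, ou_cols):
            # One OU operation: partial dot products over the active wordlines.
            for c in range(c0, min(c0 + ou_cols, n_cols)):
                partial = 0
                for r in range(r0, min(r0 + ou_rows, n_rows)):
                    partial += inputs[r] * weights[r][c]
                outputs[c] += partial  # accumulate partial sums across OUs
            cycles += 1
    return outputs, cycles


# Example: a 4x4 crossbar processed with 2x2 OUs versus all at once.
W = [[1, 0, 2, 1],
     [0, 1, 1, 0],
     [3, 1, 0, 2],
     [1, 1, 1, 1]]
x = [1, 2, 3, 4]
result_ou, cycles_ou = ou_mvm(W, x, 2, 2)
result_full, cycles_full = ou_mvm(W, x, 4, 4)
```

In this toy model the 2x2 OU setting needs four cycles where the whole-crossbar assumption needs one, while both produce the same outputs. That cycle cost is what makes the weight mapping and scheduling choices described in the abstract matter.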
Original language: English
Title of host publication: 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)
Publisher: IEEE/EUCA
Pages: 702-707
Number of pages: 6
ISBN (Electronic): 9781665421355, 978-1-6654-2134-8
ISBN (Print): 978-1-6654-2136-2
DOIs
Publication status: Published - 2022
Event: 27th Asia and South Pacific Design Automation Conference, ASP-DAC 2022 - Taipei, Taiwan, Virtual Conference
Duration: 17 Jan 2022 to 20 Jan 2022
Conference number: 27
https://aspdac2022.github.io/

Conference

Conference: 27th Asia and South Pacific Design Automation Conference, ASP-DAC 2022
Abbreviated title: ASP-DAC 2022
City: Virtual Conference
Period: 17/01/22 to 20/01/22
Internet address

Keywords

  • Design automation
  • Costs
  • Convolution
  • Asia
  • Computer architecture
  • Data transfer
  • Common Information Model (computing)
