An Energy Efficient Soft SIMD Microarchitecture and Its Application on Quantized CNNs

Yu, Pengbo; Ponzina, Flavio; Levisse, Alexandre Sébastien Julien; Gupta, Mohit; Biswas, Dwaipayan; Ansaloni, Giovanni; Atienza Alonso, David; Catthoor, Francky

doi:10.1109/TVLSI.2024.3375793

2024

Abstract

The ever-increasing computational complexity and energy consumption of today's applications, such as Machine Learning (ML) algorithms, not only strain the capabilities of the underlying hardware but also significantly restrict their wide deployment at the edge. Addressing these challenges requires novel architectural solutions that leverage opportunities exposed by the algorithms, e.g., robustness to small-bitwidth operand quantization and high intrinsic data-level parallelism. However, traditional Hardware Single Instruction Multiple Data (Hard SIMD) architectures support only a small set of operand bitwidths, limiting performance improvement. To fill this gap, this manuscript introduces a novel pipelined processor microarchitecture for arithmetic computing based on the Software-defined SIMD (Soft SIMD) paradigm, which can define arbitrary SIMD modes through control instructions at run-time. This microarchitecture is optimized for parallel fine-grained fixed-point arithmetic, such as shift/add. It can also efficiently execute sequential shift-add-based multiplication over SIMD subwords, thanks to zero-skipping and Canonical Signed Digit (CSD) coding. A lightweight repacking unit allows the subword bitwidth to be changed dynamically. These features are implemented within a tight energy and area budget. An energy consumption model is established through post-synthesis for performance assessment. We select heterogeneously quantized Convolutional Neural Networks (CNNs) from the ML domain as benchmarks and map them onto our microarchitecture. Experimental results show that our approach dramatically outperforms a traditional Hard SIMD Multiplier-Adder in terms of area and energy requirements. In particular, our microarchitecture occupies up to 59.9% less area than a Hard SIMD that supports fewer SIMD bitwidths, while consuming up to 50.1% less energy on average to execute heterogeneously quantized CNNs.
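
As an illustration of the shift-add multiplication with zero-skipping and CSD coding mentioned in the abstract, the following is a minimal software sketch, not the microarchitecture described in the article. The function names `csd_digits` and `shift_add_multiply` are hypothetical, and the SIMD subwords are modeled as a plain Python list of integers rather than as fields packed into one hardware word. It assumes the multiplier is shared by all subwords and that each nonzero CSD digit costs one add or subtract step.

```python
def csd_digits(n):
    """Return the Canonical Signed Digit form of integer n.

    Digits are in {-1, 0, +1}, least significant first, and no two
    adjacent digits are both nonzero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # n mod 4 == 1 -> digit +1; n mod 4 == 3 -> digit -1
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def shift_add_multiply(subwords, multiplier):
    """Multiply every subword by one shared multiplier using shift/add only.

    Zero CSD digits are skipped, so the number of add/subtract steps
    equals the number of nonzero digits. Returns (products, steps).
    """
    acc = [0] * len(subwords)
    steps = 0
    for shift, d in enumerate(csd_digits(multiplier)):
        if d == 0:
            continue  # zero-skipping: no work for this digit position
        steps += 1
        # The same shift and add/subtract is applied to all subwords (lanes).
        acc = [a + d * (x << shift) for a, x in zip(acc, subwords)]
    return acc, steps


if __name__ == "__main__":
    # 7 = 8 - 1, so CSD needs two steps instead of three for binary 111.
    print(csd_digits(7))
    print(shift_add_multiply([3, -2, 5], 7))
```

In this sketch, multiplying by 7 takes two add/subtract steps because CSD rewrites binary 111 as 8 - 1, whereas a plain binary shift-add scheme would need three.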

Details

Title An Energy Efficient Soft SIMD Microarchitecture and Its Application on Quantized CNNs

Author(s) Yu, Pengbo ; Ponzina, Flavio ; Levisse, Alexandre Sébastien Julien ; Gupta, Mohit ; Biswas, Dwaipayan ; Ansaloni, Giovanni ; Atienza Alonso, David ; Catthoor, Francky

Published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Date 2024-03-05

Keywords

Energy Efficient Computing; Software-defined Single Instruction Multiple Data; Data-level Parallelism; Canonic Signed Digit Coding; Heterogeneously Quantized Convolutional Neural Networks

DOI https://doi.org/10.1109/TVLSI.2024.3375793

Laboratories ESL

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > ESL - Embedded Systems Laboratory
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Accepted

Grant H2020: 101016776
Other foundations: ACCESS – AI Chip Center for Emerging Smart Systems, sponsored by InnoHK funding, Hong Kong SAR
Other foundations: joint research grant for ESL-EPFL by IMEC

Record creation date 2024-03-07