Efficient, Accurate, and Flexible PIM Inference through Adaptable Low-Resolution Arithmetic

Andrulis, Tanner

dc.contributor.advisor	Emer, Joel S.
dc.contributor.advisor	Sze, Vivienne
dc.contributor.author	Andrulis, Tanner
dc.date.accessioned	2023-07-31T19:41:27Z
dc.date.available	2023-07-31T19:41:27Z
dc.date.issued	2023-06
dc.date.submitted	2023-07-13T14:13:58.566Z
dc.identifier.uri	https://hdl.handle.net/1721.1/151461
dc.description.abstract	Processing-In-Memory (PIM) accelerators have the potential to efficiently run Deep Neural Network (DNN) inference by reducing costly data movement and by using resistive RAM (ReRAM) for efficient analog compute. Unfortunately, overall PIM accelerator efficiency and throughput are limited by area/energy-intensive analog-to-digital converters (ADCs). Furthermore, existing accelerators that reduce ADC area/energy do so by changing DNN weights or by using low-resolution ADCs that reduce output fidelity. These approaches harm DNN accuracy and/or require costly DNN retraining to compensate. To address these issues, this thesis explores tradeoffs around ADC area/energy and develops optimizations that can reduce ADC area/energy without retraining DNNs. We use these optimizations to develop a new PIM accelerator, RAELLA, which can adapt the architecture to each DNN. RAELLA lowers the resolution of computed analog values by encoding weights to produce near-zero analog values, adaptively slicing weights for each DNN layer, and dynamically slicing inputs through speculation and recovery. Low-resolution analog values allow RAELLA to both use efficient low-resolution ADCs and maintain accuracy without retraining, all while computing with fewer ADC converts. Compared to other low-accuracy-loss PIM accelerators, RAELLA increases energy efficiency by up to 4.9x and throughput by up to 3.3x. Compared to PIM accelerators that cause accuracy loss and retrain DNNs to recover, RAELLA achieves similar efficiency and throughput without expensive DNN retraining.
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Efficient, Accurate, and Flexible PIM Inference through Adaptable Low-Resolution Arithmetic
dc.type	Thesis
dc.description.degree	S.M.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.orcid	0000-0002-3168-9862
mit.thesis.degree	Master
thesis.degree.name	Master of Science in Electrical Engineering and Computer Science

Files in this item

Name:: andrulis-andrulis-sm-eecs-2023 ...
Size:: 2.566Mb
Format:: PDF
Description:: Thesis PDF

View/Open

This item appears in the following Collection(s)

Graduate Theses

Show simple item record