
dc.contributor.advisor: Chandrakasan, Anantha P.
dc.contributor.author: Ji, Alex
dc.date.accessioned: 2023-11-02T20:22:24Z
dc.date.available: 2023-11-02T20:22:24Z
dc.date.issued: 2023-09
dc.date.submitted: 2023-09-21T14:26:20.101Z
dc.identifier.uri: https://hdl.handle.net/1721.1/152854
dc.description.abstract: Machine learning inference on edge devices for image and language processing has become increasingly common in recent years, but faces challenges associated with high memory and computation requirements, coupled with limited energy resources. This work applies different quantization schemes and training techniques to reduce the cost of running these models and provide flexibility in the hardware. Energy scalability is achieved through bit width scaling, as well as model size scaling. These techniques are applied to three neural network accelerators, which have been taped out and tested, to enable efficient inference for a variety of applications. The first chip is a CNN accelerator that simplifies computation using nonlinearly quantized weights by reordering multiplication and accumulation. This modified computation requires additional storage elements compared to a conventional approach. To minimize the area overhead, a custom accumulator array layout is designed. The second chip targets moderately sized Transformer models (e.g., ALBERT) using piecewise-linear quantization (PWLQ) for both weights and activations. Lastly, an energy-adaptive accelerator for natural language understanding based on lightweight Transformer models is presented. The model size can be adjusted by sampling the weights of the full model to obtain differently sized submodels, without the memory overhead of storing multiple models.
dc.publisher: Massachusetts Institute of Technology
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Flexible Energy-Aware Image and Transformer Processors for Edge Computing
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.orcid: https://orcid.org/0009-0000-7720-9951
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy
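
Illustrative note (not part of the original record): the abstract says the first chip reorders multiplication and accumulation for nonlinearly quantized weights, at the cost of extra storage. The record does not describe the actual datapath, so the following is only a minimal sketch of one common reading of that idea: activations are first accumulated into one bin per quantization level, and each bin is multiplied by its level value once at the end. The codebook `levels`, the function names, and the sample values are all hypothetical.

```python
# Hypothetical sketch: dot product with weights stored as indices into a
# small nonuniform codebook. Names and values are assumptions for illustration.

def dot_conventional(weight_idx, activations, levels):
    """One multiply per weight: look up the level, multiply, accumulate."""
    return sum(levels[i] * a for i, a in zip(weight_idx, activations))


def dot_reordered(weight_idx, activations, levels):
    """Accumulate first, multiply last.

    Each activation is added to the accumulator of its weight's level, so the
    inner loop needs only additions. The price is one accumulator per level,
    which matches the abstract's remark about additional storage elements.
    """
    acc = [0] * len(levels)          # one accumulator per quantization level
    for i, a in zip(weight_idx, activations):
        acc[i] += a                  # addition only in the inner loop
    # One multiply per level, regardless of how many weights there are.
    return sum(level * s for level, s in zip(levels, acc))


# Assumed 3-bit nonuniform codebook and a toy input.
levels = [-9, -4, -1, 0, 1, 2, 5, 11]
weight_idx = [0, 3, 3, 7, 1]
activations = [2, 5, 1, 4, 3]

result_conventional = dot_conventional(weight_idx, activations, levels)
result_reordered = dot_reordered(weight_idx, activations, levels)
```

In this sketch the number of multiplications drops from the number of weights to the number of codebook levels, while both functions return the same value for integer inputs.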

