Hardware-aware efficient deep neural network design

Author(s)

Yang, Tien-Ju.

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Vivienne Sze.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Deep neural networks (DNNs) deliver best-in-class accuracy on various artificial intelligence applications. However, this accuracy comes at a cost: the computational complexity of DNNs is much higher than that of conventional methods. The resulting low efficiency leads to high carbon emissions and high financial cost, and it hinders the deployment of DNNs on mobile devices. Although many methods have been proposed to improve DNN efficiency, most of them optimize proxy metrics, such as the number of weights and operations. Because these proxy metrics do not reflect hardware properties, an improvement in them does not necessarily translate into better hardware metrics, such as lower latency and energy consumption, which are of the utmost importance in practice. In this thesis, we show how to properly bring hardware into the loop when designing DNNs to address these problems.

We study this research topic extensively from different perspectives and propose comprehensive solutions that realize state-of-the-art efficient DNNs across different hardware platforms, applications, and use cases. We first propose three automated DNN design algorithms that directly optimize hardware metrics to push the frontier of efficient DNNs. Because evaluating hardware metrics directly on hardware devices can be slow, we then propose two fast methods for estimating hardware metrics, which speed up the hardware-aware DNN design process for most use cases and make hardware metrics more accessible. Moreover, existing design approaches mostly target digital accelerators and image classification, but different hardware and applications face different challenges due to their specific properties and constraints. In view of this, we also explore designing efficient DNNs for a broad range of hardware and applications, demonstrating how hardware properties and constraints change the design approaches and proposing corresponding solutions.

Description

Thesis: Ph.D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September 2020

Cataloged from student-submitted PDF of thesis.

Includes bibliographical references (pages 191-217).

Date issued

2020

URI

https://hdl.handle.net/1721.1/129314

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses