Advancing Deep Learning Efficiency: From Specialized Co-Design to Automated Generation
Author(s)
Lin, Yujun
Advisor
Han, Song
Abstract
The explosive growth of artificial intelligence (AI) technologies, particularly large-scale deep learning models such as large language models and diffusion models, has intensified the demand for efficient full-stack inference solutions that effectively balance performance and cost. This thesis presents a comprehensive exploration of algorithm-system co-optimization together with hardware design specialization and automation for scalable AI deployment. We begin with algorithmic optimization for large-scale models, including large language models and diffusion models, developing inference libraries that leverage quantization to boost the performance of generative AI on existing GPU platforms. Next, we design specialized hardware accelerators for domain-specific applications, specifically point cloud understanding, emphasizing efficiency improvements through the exploitation of data sparsity. Third, we open up the hardware design space beyond template-based sizing and progress to automated, learning-based co-design of neural network and hardware architectures, maximizing their synergy through full-stack joint optimization. Finally, we introduce an automated framework for spatial accelerator generation that transforms high-level mappings into custom hardware designs supporting scalable deployment. Together, these contributions advance AI inference efficiency by bridging the gaps between advanced computational requirements and hardware capabilities, between theoretical potential and practical solutions, and between design cost and effectiveness.
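The quantization the abstract refers to can be illustrated with a minimal sketch. This is not the thesis's actual inference library; it assumes simple per-tensor symmetric INT8 weight quantization in NumPy, and all names are illustrative rather than drawn from the work itself.

    import numpy as np

    def quantize_int8(weights):
        # Per-tensor symmetric quantization: one float scale maps the
        # float32 tensor onto the int8 range [-127, 127].
        # Assumes the tensor is not all zeros.
        scale = np.abs(weights).max() / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Recover approximate float weights; the gap to the originals is
        # the quantization error an inference library must tolerate.
        return q.astype(np.float32) * scale

    # Quantize a small weight matrix and check the reconstruction error.
    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize_int8(w)
    print("max abs error:", np.abs(w - dequantize(q, s)).max())

Storing weights in INT8 rather than FP32 cuts memory traffic roughly fourfold, which is why quantization is a natural lever for speeding up inference on existing GPUs.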
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology