MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Doctoral Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Doctoral Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Advancing Deep Learning Efficiency: From Specialized Co-Design to Automated Generation

Author(s)
Lin, Yujun
Thumbnail
DownloadThesis PDF (49.09Mb)
Advisor
Han, Song
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Metadata
Show full item record
Abstract
The explosive growth of artificial intelligence (AI) technologies, particularly large-scale deep learning models such as large language models and diffusion models, has intensified the demand for efficient full-stack inference solutions that effectively balance performance and costs. This work will present a comprehensive exploration into the algorithm-system co-optimization, hardware design specialization and automation for scalable AI deployment. First, we begin with algorithmic optimization for large-scale models, including large language models and diffusion models, developing inference libraries that leverage quantization to boost the performance of generative AIs on existing GPU platforms. Next, we design specialized hardware accelerators for domain-specific applications, specifically point cloud understanding, emphasizing efficiency improvements through the exploitation of data sparsity. Finally, we open up the hardware design space beyond template-based sizing, and progress into the automated learning-based co-design of neural network and hardware architectures, maximizing their synergy with a full-stack joint optimization. We then introduce an automated framework for spatial accelerator generation, transforming high-level mappings into custom hardware designs that support scalable deployment. Together, these contributions advance AI inference efficiency by bridging the gap between advanced computational requirements and hardware capabilities, between theoretical potential and practical solutions, and between design cost and effectiveness.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/164042
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.