Show simple item record

dc.contributor.advisor: Han, Song
dc.contributor.author: Lin, Yujun
dc.date.accessioned: 2025-11-25T19:38:20Z
dc.date.available: 2025-11-25T19:38:20Z
dc.date.issued: 2025-05
dc.date.submitted: 2025-08-14T19:40:52.254Z
dc.identifier.uri: https://hdl.handle.net/1721.1/164042
dc.description.abstract: The explosive growth of artificial intelligence (AI) technologies, particularly large-scale deep learning models such as large language models and diffusion models, has intensified the demand for efficient full-stack inference solutions that balance performance and cost. This thesis presents a comprehensive exploration of algorithm-system co-optimization, hardware design specialization, and automation for scalable AI deployment. We begin with algorithmic optimization for large-scale models, including large language models and diffusion models, developing inference libraries that leverage quantization to boost the performance of generative AI on existing GPU platforms. Next, we design specialized hardware accelerators for domain-specific applications, specifically point cloud understanding, emphasizing efficiency improvements through the exploitation of data sparsity. Finally, we open up the hardware design space beyond template-based sizing and progress to automated, learning-based co-design of neural network and hardware architectures, maximizing their synergy through full-stack joint optimization. We then introduce an automated framework for spatial accelerator generation that transforms high-level mappings into custom hardware designs supporting scalable deployment. Together, these contributions advance AI inference efficiency by bridging the gaps between advanced computational requirements and hardware capabilities, between theoretical potential and practical solutions, and between design cost and effectiveness.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Advancing Deep Learning Efficiency: From Specialized Co-Design to Automated Generation
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy


