DSpace@MIT

Large-Scale Optical Hardware for Neural Network Inference Acceleration

Author(s)
Bernstein, Liane
Thesis PDF (6.346 MB)
Advisor
Englund, Dirk R.
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Artificial deep neural networks (DNNs) have revolutionized tasks such as automated classification and natural language processing. To boost accuracy and handle more complex workloads, DNN model sizes have grown exponentially over the last decade, outpacing improvements in digital electronic microprocessor efficiency. This mismatch limits DNN performance and contributes to soaring data center energy costs. Optical hardware for deep learning (optical neural networks, or ONNs) can theoretically increase DNN processing efficiency; however, the feasibility of large-scale, fully programmable and reconfigurable ONNs has not yet been comprehensively shown in experiments. This thesis reports our demonstrations of ONNs that classify ~1000-element input vectors using standard DNN layers in inference without hardware modeling or retraining. In a first project, we used digital optical links to replace copper wires for transmitting and copying data to electronic multipliers. Our experimental implementation showed an MNIST classification accuracy within <0.6% of the digital electronic ground truth. We estimated that this 'digital ONN' could reduce energy consumption for long data transfer lengths, but not in tightly packed electronic multiplier arrays. Therefore, in a second project, we expanded upon this work by performing reconfigurable optical multicast and analog optoelectronic weighting to compute DNN layer outputs in a single shot. Our proof-of-concept system yielded an MNIST classification accuracy of 96.7% (boosted to 97.3% with weight fine-tuning) with respect to the ground-truth accuracy of 97.9%. We calculated that a near-term optimized version of this system could lower energy consumption and latency by 1-2 orders of magnitude compared to a state-of-the-art digital electronic systolic array. These findings suggest a paradigm shift towards optoelectronic DNN accelerators with lower resource utilization that could enable the next generation of deep learning.
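The "single-shot" layer computation described in the abstract is, mathematically, a dense matrix-vector product: each input element is multicast to every output neuron, and each neuron accumulates its weighted sum. A minimal NumPy sketch of that operation (the sizes, random weights, and variable names here are illustrative assumptions, not taken from the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a ~1000-element input vector (e.g. a flattened,
# padded 28x28 MNIST image) weighted into 10 class scores.
n_inputs, n_outputs = 1000, 10

W = rng.standard_normal((n_outputs, n_inputs))  # weight matrix
x = rng.standard_normal(n_inputs)               # input activation vector

# One dense DNN layer output: a single matrix-vector product. This is
# the operation the described ONN computes in one shot -- optical
# multicast distributes x, and analog optoelectronic weighting forms
# each row's weighted sum.
y = W @ x

assert y.shape == (n_outputs,)
```

A digital systolic array evaluates this product by streaming partial sums through a grid of multiply-accumulate units over many clock cycles; the abstract's latency claim rests on the optical system producing all `n_outputs` sums simultaneously.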
Date issued
2024-02
URI
https://hdl.handle.net/1721.1/153830
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses
