Transformer Pruning Relation and General Neural Network Augmentation

Author(s)
Lim, Yong Hui
Thesis PDF (6.035 MB)
Advisor
Shavit, Nir
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
In this thesis, we investigate a method of initializing neural networks with weights transferred from smaller trained networks. We name this process augmentation and present several versions of it, some of which involve pruning. First, the pruning relation of testing loss against density was found for the GPT-2 transformer network on a causal language modeling task; an interesting double plateau in testing loss appeared whenever the attention weights were pruned. Next, augmentation was investigated on low-dimensional datasets and shallow networks, where we found that performing a step of zeroing final layer initializations (ZFLI) results in better augmentation. With this insight, we proceeded to investigate a variety of datasets and networks using two forms of augmentation: basic augmentation and pruned augmentation. However, neither form was found to produce a consistent improvement in testing accuracy or loss.
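The augmentation idea described in the abstract — initializing a larger network from a smaller trained one — can be sketched as below. This is a minimal illustration under stated assumptions, not the thesis's actual method: we assume each small weight matrix is embedded in the top-left block of the corresponding larger matrix, and we read ZFLI as setting the newly introduced entries of the final layer to zero rather than to random values. The function name `augment` and all shapes are hypothetical.

```python
import numpy as np

def augment(small_weights, big_shapes, zfli=True, seed=0):
    """Sketch of 'augmentation': initialize larger weight matrices from
    smaller trained ones (an assumption-based illustration, not the
    thesis's exact procedure).

    small_weights: list of trained 2-D weight arrays, one per layer.
    big_shapes:    list of (rows, cols) shapes for the larger network.
    zfli:          if True, new entries of the final layer start at zero
                   (our reading of 'zeroing final layer initializations').
    """
    rng = np.random.default_rng(seed)
    big = []
    for i, (w_small, shape) in enumerate(zip(small_weights, big_shapes)):
        is_final = (i == len(big_shapes) - 1)
        if zfli and is_final:
            # ZFLI step (assumed): final layer's new entries are zero.
            w = np.zeros(shape)
        else:
            # Other new entries get small random values.
            w = rng.normal(scale=0.01, size=shape)
        r, c = w_small.shape
        w[:r, :c] = w_small  # transfer the trained small-network weights
        big.append(w)
    return big

# Hypothetical usage: grow a 2-layer net from widths (2, 2, 1) to (4, 4, 1).
small = [np.full((2, 2), 5.0), np.full((2, 1), 7.0)]
big = augment(small, [(4, 4), (4, 1)])
```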
Date issued
2021-06
URI
https://hdl.handle.net/1721.1/139547
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
