DSpace@MIT (MIT Libraries)

Evaluating Data Augmentation with Attention Masks for Context Aware Transformations

Author(s)
Marquez, Sofia M.
Advisor
Murray, Fiona
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Transfer learning from large, pre-trained models and data augmentation are arguably the two most widespread solutions to the problem of data scarcity. However, both methods suffer from limitations that prevent more optimal solutions to natural language processing tasks. We consider that transfer learning benefits from fine-tuning on increased target dataset size, and that data augmentation benefits from applying transformations in a selective, rather than random, manner. Thus, this work evaluates a new augmentation paradigm that uses the attention masks of pre-trained transformers to more effectively apply text transformations in high-importance locations, creating augmentations which can be used for further fine-tuning. Our comprehensive analysis points to limited success in utilizing this context-aware augmentation method. By shedding light on its strengths and limitations, we offer insights that can guide the selection of optimal augmentation techniques for a variety of models, and lay groundwork for further research in the pursuit of effective solutions for natural language processing tasks under data constraints.
Date issued
2024-02
URI
https://hdl.handle.net/1721.1/155913
Department
Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
