Transfer learning and robustness for natural language processing

Jin, Di,Ph.D.Massachusetts Institute of Technology.

dc.contributor.advisor	Peter Szolovits.	en_US
dc.contributor.author	Jin, Di,Ph.D.Massachusetts Institute of Technology.	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Mechanical Engineering.	en_US
dc.date.accessioned	2021-01-05T23:12:27Z
dc.date.available	2021-01-05T23:12:27Z
dc.date.copyright	2020	en_US
dc.date.issued	2020	en_US
dc.identifier.uri	https://hdl.handle.net/1721.1/129004
dc.description	Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mechanical Engineering, 2020	en_US
dc.description	Cataloged from student-submitted PDF of thesis.	en_US
dc.description	Includes bibliographical references (pages 189-217).	en_US
dc.description.abstract	Teaching machines to understand human language is one of the most elusive and long-standing challenges in Natural Language Processing (NLP). Driven by the fast development of deep learning, state-of-the-art NLP models have already achieved human-level performance in various large benchmark datasets, such as SQuAD, SNLI, and RACE. However, when these strong models are deployed to real-world applications, they often show poor generalization capability in two situations: 1. There is only a limited amount of data available for model training; 2. Deployed models may degrade significantly in performance on noisy test data or natural/artificial adversaries. In short, performance degradation on low-resource tasks/datasets and unseen data with distribution shifts imposes great challenges to the reliability of NLP models and prevent them from being massively applied in the wild. This dissertation aims to address these two issues.	en_US
dc.description.abstract	Towards the first one, we resort to transfer learning to leverage knowledge acquired from related data in order to improve performance on a target low-resource task/dataset. Specifically, we propose different transfer learning methods for three natural language understanding tasks: multi-choice question answering, dialogue state tracking, and sequence labeling, and one natural language generation task: machine translation. These methods are based on four basic transfer learning modalities: multi-task learning, sequential transfer learning, domain adaptation, and cross-lingual transfer. We show experimental results to validate that transferring knowledge from related domains, tasks, and languages can improve the target task/dataset significantly. For the second issue, we propose methods to evaluate the robustness of NLP models on text classification and entailment tasks.	en_US
dc.description.abstract	On one hand, we reveal that although these models can achieve a high accuracy of over 90%, they still easily crash over paraphrases of original samples by changing only around 10% words to their synonyms. On the other hand, by creating a new challenge set using four adversarial strategies, we find even the best models for the aspect-based sentiment analysis task cannot reliably identify the target aspect and recognize its sentiment accordingly. On the contrary, they are easily confused by distractor aspects. Overall, these findings raise great concerns of robustness of NLP models, which should be enhanced to ensure their long-run stable service.	en_US
dc.description.statementofresponsibility	by Di Jin.	en_US
dc.format.extent	217 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Mechanical Engineering.	en_US
dc.title	Transfer learning and robustness for natural language processing	en_US
dc.type	Thesis	en_US
dc.description.degree	Ph. D.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Mechanical Engineering	en_US
dc.identifier.oclc	1227042422	en_US
dc.description.collection	Ph.D. Massachusetts Institute of Technology, Department of Mechanical Engineering	en_US
dspace.imported	2021-01-05T23:12:26Z	en_US
mit.thesis.degree	Doctoral	en_US
mit.thesis.department	MechE	en_US

Files in this item

Name:: 1227042422-MIT.pdf
Size:: 4.258Mb
Format:: PDF

View/Open

This item appears in the following Collection(s)

Doctoral Theses

Show simple item record