dc.contributor.advisor: Katz, Boris
dc.contributor.advisor: Lieberman, Henry
dc.contributor.author: Yin, Claire
dc.date.accessioned: 2022-08-29T16:36:05Z
dc.date.available: 2022-08-29T16:36:05Z
dc.date.issued: 2022-05
dc.date.submitted: 2022-05-27T16:19:30.158Z
dc.identifier.uri: https://hdl.handle.net/1721.1/145141
dc.description.abstract: Machine learning has a wide variety of applications in natural language processing (NLP). One such application is fine-tuning large pre-trained models for a wide variety of tasks. In this work, we propose methods to enhance these large language models by infusing them with information found in commonsense knowledge bases. Commonsense is the basic knowledge about the world that humans are expected to have and that is needed for efficient communication. Often, to understand a text, a person must use commonsense to make implicit inferences based on what is explicitly presented. We harness the power of relational graph convolutional networks (RGCNs) to encode meaningful commonsense information from graphs and introduce three simple methods for injecting this knowledge to improve contextual language representations from transformer-based language models. We show that the representations learned by the RGCN are useful for link prediction in a commonsense knowledge base. Additionally, we show that the methods we introduce for combining representations of structured commonsense information with a transformer-based language model yield promising results on a downstream information retrieval task, and that most types of combinations give better performance than a baseline transformer-based language model. Lastly, we show that the representations learned by an RGCN, although trained on considerably less data, still prove useful on a downstream information retrieval task when combined with a transformer-based language model.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: MEng Thesis: Incorporating Structured Commonsense into Language Models
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science
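As a rough illustration of the approach described in the abstract above (and not code from the thesis), the following is a minimal PyTorch sketch of its two ingredients: a relational graph convolution layer that encodes a toy commonsense graph with typed edges, and one plausible way to combine a node embedding from that layer with a contextual sentence embedding from a transformer-based language model. The class name RGCNLayer, all dimensions, and the concatenate-and-project fusion are illustrative assumptions.

```python
import torch
import torch.nn as nn


class RGCNLayer(nn.Module):
    """One relational graph convolution layer: a separate weight matrix per
    relation type, mean-aggregated messages, plus a self-loop transform."""

    def __init__(self, in_dim, out_dim, num_relations):
        super().__init__()
        self.rel_weights = nn.Parameter(torch.randn(num_relations, in_dim, out_dim) * 0.01)
        self.self_loop = nn.Linear(in_dim, out_dim)

    def forward(self, x, edge_index, edge_type):
        # x: [num_nodes, in_dim]; edge_index: [2, num_edges]; edge_type: [num_edges]
        src, dst = edge_index
        out = self.self_loop(x)
        # transform each incoming message with its relation-specific weight matrix
        msgs = torch.einsum('ei,eio->eo', x[src], self.rel_weights[edge_type])
        # mean-aggregate messages at each destination node
        agg = torch.zeros_like(out).index_add_(0, dst, msgs)
        deg = torch.zeros(x.size(0), 1).index_add_(0, dst, torch.ones(dst.size(0), 1)).clamp(min=1)
        return torch.relu(out + agg / deg)


# Toy commonsense graph: 4 concept nodes, 3 typed edges, 2 relation types.
x = torch.randn(4, 16)                             # initial node features
edge_index = torch.tensor([[0, 1, 2],              # source nodes
                           [1, 2, 3]])             # destination nodes
edge_type = torch.tensor([0, 1, 0])                # relation id per edge

node_emb = RGCNLayer(16, 32, num_relations=2)(x, edge_index, edge_type)

# One plausible fusion: concatenate a node's graph embedding with a sentence
# embedding from a language model (a random stand-in vector here) and project.
sent_emb = torch.randn(1, 768)                     # e.g. a BERT [CLS] vector
fused = nn.Linear(32 + 768, 768)(torch.cat([node_emb[0:1], sent_emb], dim=-1))
print(fused.shape)                                 # torch.Size([1, 768])
```

In practice, the RGCN would be trained on a commonsense knowledge base (e.g. via link prediction, as the abstract mentions) before its node embeddings are combined with the language model's representations; the concatenation shown here is only one of several possible combination schemes.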

