dc.contributor.advisor: Katz, Boris
dc.contributor.advisor: Lieberman, Henry
dc.contributor.author: Yin, Claire
dc.date.accessioned: 2022-08-29T16:36:05Z
dc.date.available: 2022-08-29T16:36:05Z
dc.date.issued: 2022-05
dc.date.submitted: 2022-05-27T16:19:30.158Z
dc.identifier.uri: https://hdl.handle.net/1721.1/145141
dc.description.abstract: Machine learning has a wide variety of applications in natural language processing (NLP). One such application is fine-tuning large pre-trained models for a wide variety of tasks. In this work, we propose methods to enhance these large language models by infusing them with information found in commonsense knowledge bases. Commonsense is the basic knowledge about the world that humans are expected to have and that is needed for efficient communication. Often, to understand a text, a person must use commonsense to make implicit inferences based on what is explicitly presented. We harness the power of relational graph convolutional networks (RGCNs) to encode meaningful commonsense information from graphs and introduce three simple methods for injecting this knowledge to improve contextual language representations from transformer-based language models. We show that the representations learned by the RGCN are useful for link prediction in a commonsense knowledge base. Additionally, we show that the methods we introduce for combining representations of structured commonsense information with a transformer-based language model yield promising results on a downstream information retrieval task, and that most types of combinations give better performance than a baseline transformer-based language model. Lastly, we show that the representations learned by an RGCN, although trained on considerably less data, still prove useful on a downstream information retrieval task when combined with a transformer-based language model.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: MEng Thesis: Incorporating Structured Commonsense into Language Models
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science
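As a rough illustration of the approach described in the abstract above (and not code from the thesis), the following is a minimal PyTorch sketch of its two ingredients: a relational graph convolution layer that encodes a toy commonsense graph with typed edges, and one plausible way to combine a node embedding from that layer with a contextual sentence embedding from a transformer-based language model. The class name RGCNLayer, all dimensions, and the concatenate-and-project fusion are illustrative assumptions.

```python
import torch
import torch.nn as nn


class RGCNLayer(nn.Module):
    """One relational graph convolution layer: a separate weight matrix per
    relation type, mean-aggregated messages, plus a self-loop transform."""

    def __init__(self, in_dim, out_dim, num_relations):
        super().__init__()
        self.rel_weights = nn.Parameter(torch.randn(num_relations, in_dim, out_dim) * 0.01)
        self.self_loop = nn.Linear(in_dim, out_dim)

    def forward(self, x, edge_index, edge_type):
        # x: [num_nodes, in_dim]; edge_index: [2, num_edges]; edge_type: [num_edges]
        src, dst = edge_index
        out = self.self_loop(x)
        # transform each incoming message with its relation-specific weight matrix
        msgs = torch.einsum('ei,eio->eo', x[src], self.rel_weights[edge_type])
        # mean-aggregate messages at each destination node
        agg = torch.zeros_like(out).index_add_(0, dst, msgs)
        deg = torch.zeros(x.size(0), 1).index_add_(0, dst, torch.ones(dst.size(0), 1)).clamp(min=1)
        return torch.relu(out + agg / deg)


# Toy commonsense graph: 4 concept nodes, 3 typed edges, 2 relation types.
x = torch.randn(4, 16)                             # initial node features
edge_index = torch.tensor([[0, 1, 2],              # source nodes
                           [1, 2, 3]])             # destination nodes
edge_type = torch.tensor([0, 1, 0])                # relation id per edge

node_emb = RGCNLayer(16, 32, num_relations=2)(x, edge_index, edge_type)

# One plausible fusion: concatenate a node's graph embedding with a sentence
# embedding from a language model (a random stand-in vector here) and project.
sent_emb = torch.randn(1, 768)                     # e.g. a BERT [CLS] vector
fused = nn.Linear(32 + 768, 768)(torch.cat([node_emb[0:1], sent_emb], dim=-1))
print(fused.shape)                                 # torch.Size([1, 768])
```

In practice, the RGCN would be trained on a commonsense knowledge base (e.g. via link prediction, as the abstract mentions) before its node embeddings are combined with the language model's representations; the concatenation shown here is only one of several possible combination schemes.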

