MEng Thesis: Incorporating Structured Commonsense into Language Models

Author(s)
Yin, Claire
Download: Thesis PDF (1.531 MB)
Advisor
Katz, Boris
Lieberman, Henry
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Machine learning has a wide variety of applications in natural language processing (NLP), one of which is fine-tuning large pre-trained models for downstream tasks. In this work, we propose methods to enhance these large language models by infusing them with information found in commonsense knowledge bases. Commonsense is basic knowledge about the world that humans are expected to have and that is needed for efficient communication. Oftentimes, to understand a text, a person must use their commonsense to make implicit inferences based on what is explicitly stated. We harness the power of relational graph convolutional networks (RGCNs) to encode meaningful commonsense information from graphs, and we introduce three simple methods for injecting this knowledge to improve the contextual representations produced by transformer-based language models. We show that the representations learned by the RGCN are useful for link prediction in a commonsense knowledge base. Additionally, we show that the methods we introduce for combining structured commonsense representations with a transformer-based language model yield promising results on a downstream information retrieval task, and in most combinations outperform a baseline transformer-based language model. Lastly, we show that the representations learned by an RGCN, although trained on considerably less data, still prove useful in a downstream information retrieval task when combined with a transformer-based language model.
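
To make the architecture in the abstract concrete, the following is a minimal PyTorch sketch, not the thesis code, of the three pieces it describes: an RGCN layer over a relational graph, a DistMult-style scorer as one standard choice for link prediction, and simple concatenation as one plausible way to fuse graph and transformer representations (the abstract does not specify which three combination methods the thesis uses). All class names, dimensions, and the choice of DistMult and concatenation are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RGCNLayer(nn.Module):
    """One relational graph convolution: each relation type has its own
    transform, and messages from neighbors are aggregated per node."""
    def __init__(self, in_dim, out_dim, num_relations):
        super().__init__()
        # One weight matrix per relation type, plus a self-loop transform.
        self.rel_w = nn.Parameter(torch.randn(num_relations, in_dim, out_dim) * 0.01)
        self.self_w = nn.Linear(in_dim, out_dim)

    def forward(self, h, edge_index, edge_type):
        # h: (N, in_dim); edge_index: (2, E) source/target node ids;
        # edge_type: (E,) relation id per edge.
        src, dst = edge_index
        # Relation-specific message for every edge: h[src] @ W[relation].
        msg = torch.einsum('ei,eio->eo', h[src], self.rel_w[edge_type])
        # Normalize by in-degree (a simplification of the per-relation
        # normalization in the original RGCN formulation).
        deg = torch.zeros(h.size(0), device=h.device).index_add_(
            0, dst, torch.ones(dst.size(0), device=h.device))
        msg = msg / deg.clamp(min=1.0)[dst].unsqueeze(-1)
        # Self-loop term plus aggregated neighbor messages.
        out = self.self_w(h).index_add(0, dst, msg)
        return F.relu(out)

class DistMultScorer(nn.Module):
    """Scores (head, relation, tail) triples for link prediction over a
    commonsense knowledge base, given RGCN node embeddings."""
    def __init__(self, dim, num_relations):
        super().__init__()
        self.rel_emb = nn.Embedding(num_relations, dim)

    def forward(self, h_head, rel_ids, h_tail):
        # Higher score = more plausible triple.
        return (h_head * self.rel_emb(rel_ids) * h_tail).sum(-1)

class ConcatFusion(nn.Module):
    """One simple way to combine a transformer text vector with a pooled
    graph vector: concatenate and project."""
    def __init__(self, lm_dim, graph_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(lm_dim + graph_dim, out_dim)

    def forward(self, lm_vec, graph_vec):
        return self.proj(torch.cat([lm_vec, graph_vec], dim=-1))

In this sketch, the RGCN embeddings would be trained with the triple scorer on the knowledge base, then the resulting node vectors fused with the language model's contextual representations for the downstream retrieval task; the specific fusion variants and training details are those of the thesis, not this illustration.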
Date issued
2022-05
URI
https://hdl.handle.net/1721.1/145141
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
