Probing Language Models for Contextual ScaleUnderstanding

Vedantam, Saaketh

Author(s)

Vedantam, Saaketh

DownloadThesis PDF (1.558Mb)

Advisor

Kim, Yoon

Terms of use

In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/

Metadata

Show full item record

Abstract

Pretrained language models (LMs) have demonstrated a remarkable ability to emit linguistic and factual knowledge in certain fields. Additionally, they seem to encode relational information about different concepts in a knowledge base. However, since they are trained solely on textual corpora, it is unclear whether these models implicitly understand anything grounded about the real world. This work investigates the extent to which LMs learn the structure of the physical world. By probing the contextualized embeddings of sentences, we examine how well LMs predict the sizes of real-world objects. We further explore the effect of adjectival modifiers on object embeddings. We show that while larger models more accurately convey scalar information through their embeddings, they perform on par with smaller models in the task of contextual prediction. Fortunately, the models are capable of identifying a difference in scale when an adjectival modifier is introduced, implying that the relevant context is successfully incorporated into the object’s embedding through the LM’s attention mechanism.

Date issued

2023-06

URI

https://hdl.handle.net/1721.1/151617

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses