Probing Language Models for Contextual Scale Understanding
Author(s)
Vedantam, Saaketh
Advisor
Kim, Yoon
Abstract
Pretrained language models (LMs) have demonstrated a remarkable ability to reproduce linguistic and factual knowledge across a range of domains. They also appear to encode relational information about the concepts in a knowledge base. However, because they are trained solely on textual corpora, it is unclear whether these models acquire any grounded understanding of the real world. This work investigates the extent to which LMs learn the structure of the physical world. By probing the contextualized embeddings of sentences, we examine how well LMs predict the sizes of real-world objects, and we further explore the effect of adjectival modifiers on object embeddings. We show that while larger models convey scalar information more accurately through their embeddings, they perform on par with smaller models on the contextual prediction task. Encouragingly, the models can identify a difference in scale when an adjectival modifier is introduced, implying that the relevant context is successfully incorporated into the object’s embedding through the LM’s attention mechanism.
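The abstract does not detail the probing setup; as a rough illustration of the general approach, the sketch below trains a linear probe on frozen contextual embeddings to predict object size. It assumes a BERT-style encoder from Hugging Face transformers and a scikit-learn Ridge regressor; the sentences, object words, and size labels are hypothetical placeholders, not the thesis's actual data or probe architecture.

```python
# Minimal sketch of a linear probe for object size over frozen contextual
# embeddings. The model choice, example sentences, and size labels are
# illustrative assumptions, not the thesis's actual experimental setup.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import Ridge

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

# Toy examples: (sentence, object word, typical size in meters).
examples = [
    ("The ant crawled across the table.", "ant", 0.005),
    ("The dog ran through the park.", "dog", 0.6),
    ("The elephant stood by the river.", "elephant", 3.0),
    ("The skyscraper towered over the city.", "skyscraper", 300.0),
]

def object_embedding(sentence: str, obj: str) -> np.ndarray:
    """Return the contextual embedding of the object's first subword token."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    obj_id = tokenizer(obj, add_special_tokens=False)["input_ids"][0]
    pos = enc["input_ids"][0].tolist().index(obj_id)
    return hidden[pos].numpy()

X = np.stack([object_embedding(s, o) for s, o, _ in examples])
y = np.log10([size for _, _, size in examples])  # probe log-scale size

probe = Ridge(alpha=1.0).fit(X, y)  # linear probe over frozen embeddings
print("Predicted log10 sizes:", probe.predict(X))
```

Predicting log-scale size with a linear probe is a common design choice in this kind of grounding study, since object sizes span many orders of magnitude and a linear map keeps the probe from learning the task on its own rather than reading it out of the embeddings.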
Date issued
2023-06
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology