Notice
This is not the latest version of this item. The latest version can be found at: https://dspace.mit.edu/handle/1721.1/138277.2
Cloze Distillation: Improving Neural Language Models with Human Next-Word Prediction
Author(s)
Eisape, Tiwalayo; Zaslavsky, Noga; Levy, Roger
Terms of use
Creative Commons Attribution
Abstract
Contemporary autoregressive language models (LMs) trained purely on corpus data have been shown to capture numerous features of human incremental processing. However, past work has also suggested dissociations between corpus probabilities and human next-word predictions. Here we evaluate several state-of-the-art language models for their match to human next-word predictions and to reading time behavior from eye movements. We then propose a novel method for distilling the linguistic information implicit in human linguistic predictions into pre-trained LMs: Cloze Distillation. We apply this method to a baseline neural LM and show potential improvement in reading time prediction and generalization to held-out human cloze data.
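To make the idea concrete, the sketch below shows one plausible form such a distillation objective could take: a standard corpus cross-entropy term interpolated with a term that pulls the LM's next-word distribution toward an empirical distribution over human cloze responses. This is an illustrative sketch under assumed details, not the paper's exact formulation; the function name cloze_distillation_loss, the interpolation weight alpha, and the tensor shapes are assumptions.

# Illustrative sketch only: combine the usual next-word cross-entropy on corpus
# text with a distillation term toward an empirical human cloze distribution.
# Names (cloze_distillation_loss, alpha) and shapes are hypothetical.
import torch
import torch.nn.functional as F

def cloze_distillation_loss(logits, corpus_targets, cloze_dist, alpha=0.5):
    # logits: (batch, vocab) LM scores for the next word.
    # corpus_targets: (batch,) gold next-word indices from the corpus.
    # cloze_dist: (batch, vocab) empirical distribution over human cloze responses.
    # alpha: assumed interpolation weight between the two terms.
    lm_loss = F.cross_entropy(logits, corpus_targets)          # corpus objective
    log_probs = F.log_softmax(logits, dim=-1)
    cloze_loss = F.kl_div(log_probs, cloze_dist,               # distillation term
                          reduction="batchmean")
    return alpha * lm_loss + (1.0 - alpha) * cloze_loss

# Toy usage with random tensors, just to show the shapes involved.
batch, vocab = 4, 100
logits = torch.randn(batch, vocab)
targets = torch.randint(0, vocab, (batch,))
cloze = torch.softmax(torch.randn(batch, vocab), dim=-1)
loss = cloze_distillation_loss(logits, targets, cloze)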
Date issued
2020
Journal
Proceedings of the 24th Conference on Computational Natural Language Learning
Publisher
Association for Computational Linguistics (ACL)
Citation
Eisape, Tiwalayo, Zaslavsky, Noga and Levy, Roger. 2020. "Cloze Distillation: Improving Neural Language Models with Human Next-Word Prediction." Proceedings of the 24th Conference on Computational Natural Language Learning.
Version: Final published version