Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models
Author(s)
Wilcox, Ethan; Qian, Peng; Futrell, Richard; Kohita, Ryosuke; Levy, Roger; Ballesteros, Miguel
Terms of use
Publisher with Creative Commons License: Creative Commons Attribution
Abstract
Humans can learn structural properties of a word from minimal experience, and deploy their learned syntactic representations uniformly across different grammatical contexts. We assess the ability of modern neural language models to reproduce this behavior in English and evaluate the effect of structural supervision on learning outcomes. First, we assess few-shot learning capabilities by developing controlled experiments that probe models' generalizations about syntactic nominal number and verbal argument structure for tokens seen as few as two times during training. Second, we assess the invariance properties of learned representations: the ability of a model to transfer syntactic generalizations from a base context (e.g., a simple declarative active-voice sentence) to a transformed context (e.g., an interrogative sentence). We test four models trained on the same dataset: an n-gram baseline, an LSTM, and two LSTM variants trained with explicit structural supervision (Dyer et al., 2016; Choe and Charniak, 2016). We find that in most cases the neural models are able to induce the proper syntactic generalizations after minimal exposure, often from just two examples during training, and that the two structurally supervised models generalize more accurately than the LSTM. All neural models are able to leverage information learned in base contexts to drive expectations in transformed contexts, indicating that they have learned some invariance properties of syntax.
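
The evaluation paradigm described in the abstract compares a language model's word-by-word expectations (surprisal) on minimal pairs that differ only in the critical syntactic property. As an illustrative sketch only, the snippet below computes total sentence surprisal for a nominal-number minimal pair; it uses GPT-2 via the Hugging Face transformers library as a stand-in language model, since the paper's n-gram, LSTM, and structurally supervised models are not bundled with this record, and it sums surprisal over the whole sentence rather than measuring it at the critical region as the paper's controlled experiments do.

import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_surprisal(sentence: str) -> float:
    """Total surprisal (in bits) of every token after the first, under the LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Logits at position t score the token at position t+1, so the first
    # token of the sentence receives no probability here.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    target_lp = log_probs[torch.arange(ids.size(1) - 1), ids[0, 1:]]
    return float(-target_lp.sum() / math.log(2))

# Nominal-number minimal pair (illustrative, not from the paper's stimuli):
# a model with the right number generalization should find the agreeing
# verb form less surprising.
pair = ("The keys to the cabinet are on the table.",
        "The keys to the cabinet is on the table.")
for s in pair:
    print(f"{sentence_surprisal(s):6.2f} bits  {s}")

A lower surprisal for the grammatical member of the pair counts as evidence that the model has induced the relevant generalization; the paper applies this logic to tokens seen only a handful of times in training and to base versus transformed contexts.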
Date issued
2020
Department
Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Journal
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Publisher
Association for Computational Linguistics (ACL)
Citation
Wilcox, Ethan, Qian, Peng, Futrell, Richard, Kohita, Ryosuke, Levy, Roger et al. 2020. "Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Version: Final published version