dc.contributor.author: Wilcox, Ethan
dc.contributor.author: Qian, Peng
dc.contributor.author: Futrell, Richard
dc.contributor.author: Kohita, Ryosuke
dc.contributor.author: Levy, Roger
dc.contributor.author: Ballesteros, Miguel
dc.date.accessioned: 2021-12-01T17:46:55Z
dc.date.available: 2021-12-01T17:46:55Z
dc.date.issued: 2020
dc.identifier.uri: https://hdl.handle.net/1721.1/138280
dc.description.abstract: Humans can learn structural properties about a word from minimal experience, and deploy their learned syntactic representations uniformly in different grammatical contexts. We assess the ability of modern neural language models to reproduce this behavior in English and evaluate the effect of structural supervision on learning outcomes. First, we assess few-shot learning capabilities by developing controlled experiments that probe models’ syntactic nominal number and verbal argument structure generalizations for tokens seen as few as two times during training. Second, we assess invariance properties of learned representation: the ability of a model to transfer syntactic generalizations from a base context (e.g., a simple declarative active-voice sentence) to a transformed context (e.g., an interrogative sentence). We test four models trained on the same dataset: an n-gram baseline, an LSTM, and two LSTM-variants trained with explicit structural supervision (Dyer et al., 2016; Charniak et al., 2016). We find that in most cases, the neural models are able to induce the proper syntactic generalizations after minimal exposure, often from just two examples during training, and that the two structurally supervised models generalize more accurately than the LSTM model. All neural models are able to leverage information learned in base contexts to drive expectations in transformed contexts, indicating that they have learned some invariance properties of syntax. [en_US]
dc.language.iso: en
dc.publisher: Association for Computational Linguistics (ACL) [en_US]
dc.relation.isversionof: 10.18653/V1/2020.EMNLP-MAIN.375 [en_US]
dc.rights: Creative Commons Attribution 4.0 International license [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computational Linguistics [en_US]
dc.title: Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Wilcox, Ethan, Qian, Peng, Futrell, Richard, Kohita, Ryosuke, Levy, Roger et al. 2020. "Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
dc.contributor.department: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
dc.relation.journal: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2021-12-01T17:44:53Z
dspace.orderedauthors: Wilcox, E; Qian, P; Futrell, R; Kohita, R; Levy, R; Ballesteros, M [en_US]
dspace.date.submission: 2021-12-01T17:44:54Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]
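
Probes of the kind summarized in the abstract are commonly operationalized with surprisal: a model is credited with the right syntactic generalization if it assigns lower surprisal (higher probability) to the grammatical member of a minimal pair, both in the base declarative context and in the transformed (e.g., interrogative) context. The following is a minimal sketch of such a probe, not the paper's evaluation code: it uses an off-the-shelf GPT-2 model via Hugging Face transformers as a stand-in for the paper's n-gram, LSTM, and structurally supervised models, and the minimal pairs are invented illustrations rather than items from the paper's stimuli.

# Minimal sketch of a surprisal-based number-agreement probe (illustrative only).
# Assumptions: GPT-2 via Hugging Face transformers stands in for the paper's models,
# and the minimal pairs below are invented examples, not the paper's stimuli.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def total_surprisal(sentence: str) -> float:
    """Sum of per-token surprisals (negative log-probabilities, in nats)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Position t predicts token t+1; the first token has no context and is skipped.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    return -log_probs[torch.arange(targets.size(0)), targets].sum().item()

# Base context: declarative sentence; transformed context: interrogative.
pairs = {
    "base": ("The dogs near the house bark.",
             "The dogs near the house barks."),
    "transformed": ("Do the dogs near the house bark?",
                    "Does the dogs near the house bark?"),
}
for context, (grammatical, ungrammatical) in pairs.items():
    delta = total_surprisal(ungrammatical) - total_surprisal(grammatical)
    print(f"{context}: surprisal difference (ungrammatical - grammatical) = {delta:.2f} nats")

A positive difference in both contexts would suggest that the agreement preference learned in the base context carries over to the transformed one, the kind of invariance the abstract describes.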

