dc.contributor.author: Wilcox, Ethan
dc.contributor.author: Qian, Peng
dc.contributor.author: Futrell, Richard
dc.contributor.author: Kohita, Ryosuke
dc.contributor.author: Levy, Roger
dc.contributor.author: Ballesteros, Miguel
dc.date.accessioned: 2021-12-01T17:46:55Z
dc.date.available: 2021-12-01T17:46:55Z
dc.date.issued: 2020
dc.identifier.uri: https://hdl.handle.net/1721.1/138280
dc.description.abstract: Humans can learn structural properties about a word from minimal experience, and deploy their learned syntactic representations uniformly in different grammatical contexts. We assess the ability of modern neural language models to reproduce this behavior in English and evaluate the effect of structural supervision on learning outcomes. First, we assess few-shot learning capabilities by developing controlled experiments that probe models’ syntactic nominal number and verbal argument structure generalizations for tokens seen as few as two times during training. Second, we assess invariance properties of learned representation: the ability of a model to transfer syntactic generalizations from a base context (e.g., a simple declarative active-voice sentence) to a transformed context (e.g., an interrogative sentence). We test four models trained on the same dataset: an n-gram baseline, an LSTM, and two LSTM-variants trained with explicit structural supervision (Dyer et al., 2016; Charniak et al., 2016). We find that in most cases, the neural models are able to induce the proper syntactic generalizations after minimal exposure, often from just two examples during training, and that the two structurally supervised models generalize more accurately than the LSTM model. All neural models are able to leverage information learned in base contexts to drive expectations in transformed contexts, indicating that they have learned some invariance properties of syntax. [en_US]
dc.language.iso: en
dc.publisher: Association for Computational Linguistics (ACL) [en_US]
dc.relation.isversionof: 10.18653/V1/2020.EMNLP-MAIN.375 [en_US]
dc.rights: Creative Commons Attribution 4.0 International license [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computational Linguistics [en_US]
dc.title: Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Wilcox, Ethan, Qian, Peng, Futrell, Richard, Kohita, Ryosuke, Levy, Roger et al. 2020. "Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
dc.contributor.department: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
dc.relation.journal: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2021-12-01T17:44:53Z
dspace.orderedauthors: Wilcox, E; Qian, P; Futrell, R; Kohita, R; Levy, R; Ballesteros, M [en_US]
dspace.date.submission: 2021-12-01T17:44:54Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]
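
Probes of the kind summarized in the abstract are commonly operationalized with surprisal: a model is credited with the right syntactic generalization if it assigns lower surprisal (higher probability) to the grammatical member of a minimal pair, both in the base declarative context and in the transformed (e.g., interrogative) context. The following is a minimal sketch of such a probe, not the paper's evaluation code: it uses an off-the-shelf GPT-2 model via Hugging Face transformers as a stand-in for the paper's n-gram, LSTM, and structurally supervised models, and the minimal pairs are invented illustrations rather than items from the paper's stimuli.

# Minimal sketch of a surprisal-based number-agreement probe (illustrative only).
# Assumptions: GPT-2 via Hugging Face transformers stands in for the paper's models,
# and the minimal pairs below are invented examples, not the paper's stimuli.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def total_surprisal(sentence: str) -> float:
    """Sum of per-token surprisals (negative log-probabilities, in nats)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Position t predicts token t+1; the first token has no context and is skipped.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    return -log_probs[torch.arange(targets.size(0)), targets].sum().item()

# Base context: declarative sentence; transformed context: interrogative.
pairs = {
    "base": ("The dogs near the house bark.",
             "The dogs near the house barks."),
    "transformed": ("Do the dogs near the house bark?",
                    "Does the dogs near the house bark?"),
}
for context, (grammatical, ungrammatical) in pairs.items():
    delta = total_surprisal(ungrammatical) - total_surprisal(grammatical)
    print(f"{context}: surprisal difference (ungrammatical - grammatical) = {delta:.2f} nats")

A positive difference in both contexts would suggest that the agreement preference learned in the base context carries over to the transformed one, the kind of invariance the abstract describes.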

