| dc.contributor.author | Hu, Jennifer | |
| dc.contributor.author | Gauthier, Jon | |
| dc.contributor.author | Qian, Peng | |
| dc.contributor.author | Wilcox, Ethan | |
| dc.contributor.author | Levy, Roger P | |
| dc.date.accessioned | 2021-04-07T16:02:37Z | |
| dc.date.available | 2021-04-07T16:02:37Z | |
| dc.date.issued | 2020-07 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/130402 | |
| dc.description.abstract | While state-of-the-art neural network models continue to achieve lower perplexity scores on language modeling benchmarks, it remains unknown whether optimizing for broad-coverage predictive performance leads to human-like syntactic knowledge. Furthermore, existing work has not provided a clear picture about the model properties required to produce proper syntactic generalizations. We present a systematic evaluation of the syntactic knowledge of neural language models, testing 20 combinations of model types and data sizes on a set of 34 English-language syntactic test suites. We find substantial differences in syntactic generalization performance by model architecture, with sequential models underperforming other architectures. Factorially manipulating model architecture and training dataset size (1M-40M words), we find that variability in syntactic generalization performance is substantially greater by architecture than by dataset size for the corpora tested in our experiments. Our results also reveal a dissociation between perplexity and syntactic generalization performance. | en_US |
| dc.description.sponsorship | National Institutes of Health (U.S.) (Award T32NS105587) | en_US |
| dc.language.iso | en | |
| dc.publisher | Association for Computational Linguistics (ACL) | en_US |
| dc.relation.isversionof | 10.18653/V1/2020.ACL-MAIN.158 | en_US |
| dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
| dc.source | Association for Computational Linguistics | en_US |
| dc.title | A Systematic Assessment of Syntactic Generalization in Neural Language Models | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Hu, Jennifer et al. “A Systematic Assessment of Syntactic Generalization in Neural Language Models.” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (July 2020): 1725–1744 © 2020 The Author(s) | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences | en_US |
| dc.relation.journal | Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics | en_US |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
| eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
| dc.date.updated | 2021-04-07T14:41:25Z | |
| dspace.orderedauthors | Hu, J; Gauthier, J; Qian, P; Wilcox, E; Levy, R | en_US |
| dspace.date.submission | 2021-04-07T14:41:33Z | |
| mit.license | PUBLISHER_POLICY | |
| mit.metadata.status | Complete | |