
dc.contributor.advisor: Robert C. Berwick. [en_US]
dc.contributor.author: Vázquez Martínez, Héctor Javier. [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2021-05-24T19:52:25Z
dc.date.available: 2021-05-24T19:52:25Z
dc.date.copyright: 2021 [en_US]
dc.date.issued: 2021 [en_US]
dc.identifier.uri: https://hdl.handle.net/1721.1/130703
dc.description: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021 [en_US]
dc.description: Cataloged from the official PDF of thesis. [en_US]
dc.description: Includes bibliographical references (pages 71-73). [en_US]
dc.description.abstract: In order to effectively assess Knowledge of Language (KoL) for any statistically-based Language Model (LM), one must develop a test that, first, is comprehensive in its coverage of linguistic phenomena; second, is backed by statistically vetted human judgement data; and third, tests LMs' ability to track human gradient sentence acceptability judgements. Presently, most studies of KoL on LMs have focused on at most two of these three requirements at a time. This thesis takes steps toward a test of KoL that meets all three requirements by proposing the LI-Adger dataset: a comprehensive collection of 519 sentence types spanning the field of generative grammar, accompanied by attested and replicable human acceptability judgements for each of the 4177 sentences in the dataset, and complemented by the Acceptability Delta Criterion (ADC), an evaluation metric that enforces the gradience of acceptability by testing whether LMs can track the human data. [en_US]
dc.description.abstract: To validate this proposal, this thesis conducts a series of case studies with Bidirectional Encoder Representations from Transformers (BERT; Devlin et al., 2018). It first confirms the loss of statistical power caused by treating sentence acceptability as a categorical metric by benchmarking three BERT models fine-tuned using the Corpus of Linguistic Acceptability (CoLA; Warstadt & Bowman, 2019) on the comprehensive LI-Adger dataset. We find that although the BERT models achieve approximately 94% correct classification of the minimal pairs in the dataset, a trigram model trained on the British National Corpus by Sprouse et al. (2018) performs similarly well (75%). Adopting the ADC immediately reveals that neither model is able to track the gradience of acceptability across minimal pairs: both BERT and the trigram model score only approximately 30% of the minimal pairs correctly. [en_US]
dc.description.abstract: Additionally, we demonstrate how the ADC rewards gradience by benchmarking the default BERT model using pseudo log-likelihood (PLL) scores, which raises its score to 38% correct prediction of all minimal pairs. This thesis thus identifies the need for an evaluation metric that tests KoL via gradient acceptability over the course of two case studies with BERT and proposes the ADC in response. We verify the effectiveness of the ADC using the LI-Adger dataset, a representative collection of 4177 sentences forming 2394 unique minimal pairs, each backed by replicable and statistically powerful human judgement data. Taken together, this thesis proposes and provides the three necessary requirements for the comprehensive linguistic analysis and test of the Human KoL exhibited by LMs that is currently missing in the field of Computational Linguistics. [en_US]
dc.description.statementofresponsibility: by Héctor Javier Vázquez Martínez. [en_US]
dc.format.extent: 73 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: The acceptability delta criterion : memorization is not enough [en_US]
dc.type: Thesis [en_US]
dc.description.degree: M. Eng. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.identifier.oclc: 1251800552 [en_US]
dc.description.collection: M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science [en_US]
dspace.imported: 2021-05-24T19:52:25Z [en_US]
mit.thesis.degree: Master [en_US]
mit.thesis.department: EECS [en_US]

