dc.contributor.advisor: Katz, Boris
dc.contributor.author: Sleeper, Dylan
dc.date.accessioned: 2022-06-15T13:11:29Z
dc.date.available: 2022-06-15T13:11:29Z
dc.date.issued: 2022-02
dc.date.submitted: 2022-02-22T18:32:10.921Z
dc.identifier.uri: https://hdl.handle.net/1721.1/143309
dc.description.abstract: In this work, we collect a new human-annotated dataset called Grounded SCAN Human (gSCAN Human) as an extension of the original Grounded SCAN (gSCAN) dataset. The original gSCAN dataset was created to test various compositional generalizations by holding out certain examples during training. At test time, models must zero-shot execute commands that require the agent to move in new directions, commands that contain novel combinations of objects and adjectives, and other such generalizations, each in a different test set called a split. However, gSCAN does not contain splits that test zero-shot generalization to new sentence structures and a whole new vocabulary. The gSCAN Human dataset was created to test these generalizations: can a model trained using a simple grammar generalize to human annotations? We collected and verified a total of 1,391 human annotations across all of the gSCAN splits (excluding the test and dev splits) and evaluated various models on each of the splits. We test the original gSCAN baseline with several modifications, including the baseline with a transformer replacing the encoder, and one with early multimodal fusion of the sentence encoding with the visual embedding. We also test a multimodal transformer similar to ViLBERT, which is the state of the art on the original gSCAN splits. We find that the models are somewhat robust to varying sentence structure and to new vocabulary taken separately; however, the models are far less successful given a combination of the two, as evaluated by the human data.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Grounded SCAN Human: A Benchmark for Zero-Shot Generalizations
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science