
dc.contributor.advisor: Glass, James R.
dc.contributor.author: Palmer, Ian A.
dc.date.accessioned: 2022-01-14T14:45:39Z
dc.date.available: 2022-01-14T14:45:39Z
dc.date.issued: 2021-06
dc.date.submitted: 2021-06-17T20:13:59.319Z
dc.identifier.uri: https://hdl.handle.net/1721.1/139030
dc.description.abstract: Visually-grounded spoken language datasets can enable models to learn cross-modal correspondences with very weak supervision. However, modern audio-visual datasets contain biases that undermine the real-world performance of models trained on that data. We introduce Spoken ObjectNet, which is designed to remove some of these biases and provide a way to better evaluate how effectively models will perform in real-world scenarios. This dataset expands upon ObjectNet, which is a large-scale image dataset that features controls for biases encoded into many other common image datasets. We detail our data collection pipeline, which features several methods to improve caption quality, including automated language model checks. We also present an analysis of the vocabulary of our collected captions. Lastly, we show baseline results on several audio-visual machine learning tasks, including retrieval and machine captioning. These results show that models trained on other datasets and then evaluated on Spoken ObjectNet tend to perform poorly due to biases in other datasets that the models have learned. We also show evidence that the performance decrease is due to the dataset controls, and not the transfer setting. We intend to make our dataset openly available to the general public to encourage new lines of work in training models that are better equipped to operate in the real world.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright MIT
dc.rights.uri: http://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Spoken ObjectNet: Creating a Bias-Controlled Spoken Caption Dataset
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science

