Show simple item record

dc.contributor.advisor: Antonio Torralba. [en_US]
dc.contributor.author: Hu, Jeffrey (Jeffrey H.) [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2020-11-23T17:39:01Z
dc.date.available: 2020-11-23T17:39:01Z
dc.date.copyright: 2019 [en_US]
dc.date.issued: 2019 [en_US]
dc.identifier.uri: https://hdl.handle.net/1721.1/128567
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. [en_US]
dc.description: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2019 [en_US]
dc.description: Cataloged from student-submitted PDF of thesis. [en_US]
dc.description: Includes bibliographical references (pages 35-36). [en_US]
dc.description.abstract: Segmentation datasets are smaller and much more expensive to collect than their image classification counterparts. Leveraging machine learning in the annotation process will be critical to scaling these datasets up. In this thesis, we propose an iterative cluster-based approach to segmentation data collection. By using existing networks to predict millions of segmentations and clustering to group similar predictions together, we ask human annotators a small number of questions per cluster and collect a large number of reasonable-quality segmentations at low cost. Although the collected segmentations are biased towards objects already predicted by the network, we demonstrate that they improve performance upon re-training and that the procedure can be applied iteratively, up to a point, to discover harder and harder objects. We demonstrate this pipeline in simulation and show promising results on real unlabeled images. We also present a new annotation tool called LabelMeLite for the rapid filtering and editing of predicted segmentations. [en_US]
dc.description.statementofresponsibility: by Jeffrey Hu. [en_US]
dc.format.extent: 36 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Clustering for large-scale segmentation dataset collection [en_US]
dc.type: Thesis [en_US]
dc.description.degree: M. Eng. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.identifier.oclc: 1220836817 [en_US]
dc.description.collection: M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science [en_US]
dspace.imported: 2020-11-23T17:39:00Z [en_US]
mit.thesis.degree: Master [en_US]
mit.thesis.department: EECS [en_US]

