Show simple item record

dc.contributor.author	Moll, Oscar
dc.contributor.author	Favela, Manuel
dc.contributor.author	Madden, Samuel
dc.contributor.author	Gadepally, Vijay
dc.contributor.author	Cafarella, Michael
dc.date.accessioned	2024-01-10T18:19:39Z
dc.date.available	2024-01-10T18:19:39Z
dc.date.issued	2023-12-12
dc.identifier.issn	2836-6573
dc.identifier.uri	https://hdl.handle.net/1721.1/153299
dc.description.abstract	As image datasets become ubiquitous, the problem of ad-hoc searches over image data is increasingly important. Many high-level data tasks in machine learning, such as constructing datasets for training and testing object detectors, involve finding ad-hoc objects or scenes within large image datasets as a key sub-problem. New foundational visual-semantic embeddings such as CLIP, trained on massive web datasets, can help users start searches on their own data, but we find there is a long tail of queries where these models fall short in practice. SeeSaw is a system for interactive ad-hoc searches on image datasets that integrates state-of-the-art embeddings like CLIP with user feedback in the form of box annotations to help users quickly locate images of interest in their data, even in the long tail of harder queries. One key challenge for SeeSaw is that many sensible approaches to incorporating feedback into future results, including state-of-the-art active-learning algorithms, can worsen results compared to introducing no feedback at all, partly due to CLIP's high average performance. SeeSaw therefore employs several algorithms to transform user feedback into consistent improvements over CLIP alone. We compare SeeSaw's accuracy against both CLIP alone and a state-of-the-art active-learning baseline, and find that SeeSaw consistently improves results for users across four datasets and more than a thousand queries. SeeSaw increases Average Precision (AP) on search tasks by an average of .08 on a wide benchmark (from a base of .72), and by .27 on a subset of harder queries where CLIP alone performs poorly.	en_US
dc.publisher	ACM	en_US
dc.relation.isversionof	https://doi.org/10.1145/3626754	en_US
dc.rights	Creative Commons Attribution	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.source	Association for Computing Machinery	en_US
dc.title	SeeSaw: Interactive Ad-hoc Search Over Image Databases	en_US
dc.type	Article	en_US
dc.identifier.citation	Moll, Oscar, Favela, Manuel, Madden, Samuel, Gadepally, Vijay and Cafarella, Michael. 2023. "SeeSaw: Interactive Ad-hoc Search Over Image Databases." Proceedings of the ACM on Management of Data, 1 (4 (SIGMOD)).
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department	Lincoln Laboratory
dc.relation.journal	Proceedings of the ACM on Management of Data	en_US
dc.identifier.mitlicense	PUBLISHER_CC
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2024-01-01T08:51:03Z
dc.language.rfc3066	en
dc.rights.holder	The author(s)
dspace.date.submission	2024-01-01T08:51:03Z
mit.journal.volume	1	en_US
mit.journal.issue	4 (SIGMOD)	en_US
mit.license	PUBLISHER_CC
mit.metadata.status	Authority Work and Publication Information Needed	en_US


Files in this item


This item appears in the following Collection(s)
