SeeSaw: Interactive Ad-hoc Search Over Image Databases

Moll, Oscar; Favela, Manuel; Madden, Samuel; Gadepally, Vijay; Cafarella, Michael

dc.contributor.author	Moll, Oscar
dc.contributor.author	Favela, Manuel
dc.contributor.author	Madden, Samuel
dc.contributor.author	Gadepally, Vijay
dc.contributor.author	Cafarella, Michael
dc.date.accessioned	2024-01-10T18:19:39Z
dc.date.available	2024-01-10T18:19:39Z
dc.date.issued	2023-12-12
dc.identifier.issn	2836-6573
dc.identifier.uri	https://hdl.handle.net/1721.1/153299
dc.description.abstract	As image datasets become ubiquitous, the problem of ad-hoc searches over image data is increasingly important. Many high-level data tasks in machine learning, such as constructing datasets for training and testing object detectors, imply finding ad-hoc objects or scenes within large image datasets as a key sub-problem. New foundational visual-semantic embeddings trained on massive web datasets, such as CLIP, can help users start searches on their own data, but we find there is a long tail of queries where these models fall short in practice. SeeSaw is a system for interactive ad-hoc searches on image datasets that integrates state-of-the-art embeddings like CLIP with user feedback in the form of box annotations to help users quickly locate images of interest in their data, even in the long tail of harder queries. One key challenge for SeeSaw is that many sensible approaches to incorporating feedback into future results, including state-of-the-art active-learning algorithms, can worsen results compared to introducing no feedback at all, partly due to CLIP’s high average performance. Therefore, SeeSaw employs several algorithms to transform user feedback into consistent improvements over CLIP alone. We compare SeeSaw’s accuracy both to using CLIP alone and to a state-of-the-art active-learning baseline, and find SeeSaw consistently helps improve results for users across four datasets and more than a thousand queries. SeeSaw increases Average Precision (AP) on search tasks by an average of 0.08 on a wide benchmark (from a base of 0.72), and by 0.27 on a subset of harder queries where CLIP alone performs poorly.	en_US
dc.publisher	ACM	en_US
dc.relation.isversionof	https://doi.org/10.1145/3626754	en_US
dc.rights	Creative Commons Attribution	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.source	Association for Computing Machinery	en_US
dc.title	SeeSaw: Interactive Ad-hoc Search Over Image Databases	en_US
dc.type	Article	en_US
dc.identifier.citation	Moll, Oscar, Favela, Manuel, Madden, Samuel, Gadepally, Vijay and Cafarella, Michael. 2023. "SeeSaw: Interactive Ad-hoc Search Over Image Databases." Proceedings of the ACM on Management of Data, 1 (4 (SIGMOD)).
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department	Lincoln Laboratory
dc.relation.journal	Proceedings of the ACM on Management of Data	en_US
dc.identifier.mitlicense	PUBLISHER_CC
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2024-01-01T08:51:03Z
dc.language.rfc3066	en
dc.rights.holder	The author(s)
dspace.date.submission	2024-01-01T08:51:03Z
mit.journal.volume	1	en_US
mit.journal.issue	4 (SIGMOD)	en_US
mit.license	PUBLISHER_CC
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: license_rdf
Size: 40 bytes
Format: application/rdf+xml


Name: 3626754.pdf
Size: 1.673 Mb
Format: PDF


This item appears in the following Collection(s)

MIT Open Access Articles
