Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation

Vendrow, Joshua L.

dc.contributor.advisor	Mądry, Aleksander
dc.contributor.author	Vendrow, Joshua L.
dc.date.accessioned	2024-08-21T18:55:14Z
dc.date.available	2024-08-21T18:55:14Z
dc.date.issued	2024-05
dc.date.submitted	2024-07-10T13:00:00.670Z
dc.identifier.uri	https://hdl.handle.net/1721.1/156303
dc.description.abstract	Distribution shift is a major source of failure for machine learning models. However, evaluating model reliability under distribution shift can be challenging, especially since it may be difficult to acquire counterfactual examples that exhibit a specified shift. In this work, we introduce the notion of a dataset interface: a framework that, given an input dataset and a user-specified shift, returns instances from that input distribution that exhibit the desired shift. We study a number of natural implementations for such an interface, and find that they often introduce confounding shifts that complicate model evaluation. Motivated by this, we propose a dataset interface implementation that leverages Textual Inversion to tailor generation to the input distribution. We then demonstrate how applying this dataset interface to the ImageNet dataset enables studying model behavior across a diverse array of distribution shifts, including variations in background, lighting, and attributes of the objects.
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation
dc.type	Thesis
dc.description.degree	S.M.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Master
thesis.degree.name	Master of Science in Electrical Engineering and Computer Science

Files in this item

Name: vendrow-jvendrow-sm-eecs-2024- ...
Size: 44.71 MB
Format: PDF
Description: Thesis PDF

This item appears in the following Collection(s)

Graduate Theses
