
dc.contributor.advisor Mądry, Aleksander
dc.contributor.author Jain, Saachi
dc.date.accessioned 2024-03-21T19:13:36Z
dc.date.available 2024-03-21T19:13:36Z
dc.date.issued 2024-02
dc.date.submitted 2024-02-21T17:18:47.875Z
dc.identifier.uri https://hdl.handle.net/1721.1/153886
dc.description.abstract Neural networks can fail to generalize to real-world data, particularly on subpopulations that might have been mislabelled, corrupted, or underrepresented during training. In such settings, the set of features that a model relies on, or its feature prior, often determines the model’s ultimate reliability. While many factors contribute to a model’s feature prior, recent evidence indicates that the training dataset often plays a pivotal role. This thesis therefore aims to build the foundation for a data-centric perspective on model reliability by uncovering how the training dataset’s composition affects the model’s feature prior, and thus the mistakes the model tends to make. It advances this objective through two main thrusts: developing scalable tools for identifying model failure modes in large datasets, and investigating the impact of pre-training data on the reliability of transfer learning models. In the first thrust, we develop techniques for uncovering meaningful patterns of model errors, especially in settings where manual exploration is prohibitively expensive. This includes building a framework for generating counterfactual images to debug model behavior, as well as introducing a technique for automatically identifying failure modes by distilling them as directions in a latent space. We also propose a data-based approach that mitigates such failures at their source by isolating the training examples that drive a targeted bias. In the second thrust, we investigate the role of the pre-training data in the transfer learning setting, where a pre-trained model is adapted to a downstream task. Here, we first explore the problem of “bias transfer”, where biases from the pre-trained model can persist even after adapting the model to the downstream task. We then introduce transfer influences, a framework for pinpointing the counterfactual impact of a pre-training datapoint on the final prediction. This framework enables us to isolate (and remove) detrimental points from the pre-training dataset to improve transfer learning performance.
dc.publisher Massachusetts Institute of Technology
dc.rights In Copyright - Educational Use Permitted
dc.rights Copyright retained by author(s)
dc.rights.uri https://rightsstatements.org/page/InC-EDU/1.0/
dc.title A Data-Based Perspective on Model Reliability
dc.type Thesis
dc.description.degree Ph.D.
dc.contributor.department Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree Doctoral
thesis.degree.name Doctor of Philosophy
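
The abstract’s first thrust mentions automatically identifying failure modes by distilling them as directions in a latent space. Below is a minimal sketch of one way that idea can be instantiated, assuming you already have embeddings of held-out examples (e.g., from a vision-language model) and a per-example record of whether the model under inspection predicted correctly; the function and variable names here are illustrative, not the thesis’s actual implementation.

```python
# Sketch: recover a candidate failure-mode direction by linearly separating
# correctly vs. incorrectly predicted examples in a shared embedding space.
# `embeddings` and `correct` are assumed inputs, not from the thesis itself.
import numpy as np
from sklearn.svm import LinearSVC

def distill_failure_direction(embeddings: np.ndarray, correct: np.ndarray):
    """Fit a linear separator between correct (1) and incorrect (0)
    predictions; its normal vector is a candidate failure-mode direction."""
    svm = LinearSVC(C=0.1, max_iter=10_000)
    svm.fit(embeddings, correct)
    direction = svm.coef_[0]
    direction /= np.linalg.norm(direction)
    return direction

def rank_by_failure_score(embeddings: np.ndarray, direction: np.ndarray):
    """Project examples onto the failure direction; the lowest-scoring
    indices are the examples most aligned with the failure mode."""
    scores = embeddings @ direction
    return np.argsort(scores)
```

Examples at the negative extreme of the recovered direction can then be surfaced for manual inspection, which is how an abstract latent direction becomes a human-interpretable description of a failure mode.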
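The second thrust’s transfer influences framework attributes a downstream prediction to individual pre-training datapoints. One common way to estimate such counterfactual influence is subset resampling: pretrain on many random subsets of the data, adapt each resulting model to the downstream task, and compare downstream performance between subsets that include a point and subsets that exclude it. The sketch below is a schematic estimator under those assumptions, with a hypothetical `pretrain_and_finetune(subset)` callable standing in for the (expensive) training pipeline; it is not the thesis’s exact method.

```python
# Sketch: subset-resampling estimate of each pre-training point's influence
# on downstream performance. `pretrain_and_finetune` is a hypothetical
# stand-in that returns a downstream metric (e.g., target-task accuracy).
import numpy as np

def estimate_transfer_influences(n_points: int, n_models: int,
                                 subset_frac: float,
                                 pretrain_and_finetune) -> np.ndarray:
    rng = np.random.default_rng(0)
    masks = np.zeros((n_models, n_points), dtype=bool)
    perf = np.zeros(n_models)
    for m in range(n_models):
        subset = rng.choice(n_points, size=int(subset_frac * n_points),
                            replace=False)
        masks[m, subset] = True
        perf[m] = pretrain_and_finetune(subset)
    # Influence of point i: mean downstream performance over models whose
    # pre-training set contained i, minus the mean over models without i.
    # (n_models should be large enough that every point lands in both groups.)
    with_i = np.array([perf[masks[:, i]].mean() for i in range(n_points)])
    without_i = np.array([perf[~masks[:, i]].mean() for i in range(n_points)])
    return with_i - without_i

# Toy usage with a synthetic stand-in for the training pipeline:
# influences = estimate_transfer_influences(
#     n_points=1_000, n_models=200, subset_frac=0.5,
#     pretrain_and_finetune=lambda s: len(s) / 1_000)
```

Points with the most negative estimated influence are the natural candidates for the “isolate (and remove) detrimental points” step the abstract describes.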

