MIT Libraries: DSpace@MIT


Gradient Subgroup Scanning for Distributionally and Outlier Robust Models

Author(s)
Jung, Luann
Advisor
Solomon, Justin
Yurochkin, Mikhail
Greenewald, Kristjan
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Traditional machine learning methods such as empirical risk minimization (ERM) often achieve high accuracy on average but low accuracy on certain subgroups, especially when spurious correlations exist between the input data and the labels. Previous approaches for reducing the gap between average and worst-group accuracy typically require expensive subgroup annotations, either for every training data point, as in group distributionally robust optimization (DRO), or for every validation data point. Furthermore, these distributionally robust approaches tend to perform worse when outliers are also present in the data. Unfortunately, existing methods for improving subgroup performance cannot simply be combined with prior approaches for excluding outliers, because the two often directly conflict. This work proposes a method that addresses both group robustness and outlier exclusion when training machine learning models and requires no prior knowledge of the subgroups or outliers in the data. We aim to balance these two traditionally conflicting goals by clustering the gradients of the losses with respect to the model parameters. In doing so, we find minority subgroups and exclude outliers under the assumption that gradients within groups behave similarly while outliers exhibit more randomized behavior. When applied to the Waterbirds image dataset, the method improves both average and worst-group accuracy compared to baselines and other previous methods.
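The core idea in the abstract, clustering per-example loss gradients so that dense clusters indicate subgroups and isolated gradients indicate outliers, can be illustrated with a small sketch. This is not the thesis's algorithm: the logistic-regression model, the simplified DBSCAN-style clustering, the toy data, the function names (`per_sample_gradients`, `cluster_gradients`, `balanced_weights`), and the parameters `eps` and `min_pts` are all assumptions chosen only to make the idea concrete.

```python
import math

def per_sample_gradients(w, X, y):
    """Gradient of the logistic loss w.r.t. w, computed separately per example."""
    grads = []
    for x_i, y_i in zip(X, y):
        z = sum(w_j * x_j for w_j, x_j in zip(w, x_i))
        p = 1.0 / (1.0 + math.exp(-z))          # predicted probability
        grads.append([(p - y_i) * x_j for x_j in x_i])
    return grads

def cluster_gradients(grads, eps, min_pts):
    """Simplified DBSCAN-style clustering (an assumed stand-in technique).

    Returns one label per gradient: 0, 1, ... for dense clusters and -1 for
    isolated points, which are treated here as outliers.
    """
    n = len(grads)
    neighbors = [
        [j for j in range(n) if math.dist(grads[i], grads[j]) <= eps]
        for i in range(n)
    ]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        # Start a new cluster only from an unlabelled core point.
        if labels[i] != -1 or len(neighbors[i]) < min_pts:
            continue
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] != -1:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:   # only core points expand the cluster
                frontier.extend(neighbors[j])
        cluster += 1
    return labels

def balanced_weights(labels):
    """Hypothetical follow-up step: drop outliers, give each cluster equal total weight."""
    clusters = sorted(set(labels) - {-1})
    sizes = {c: labels.count(c) for c in clusters}
    return [0.0 if l == -1 else 1.0 / (len(clusters) * sizes[l]) for l in labels]

# Toy data (made up): six majority points, three minority points, two outliers.
X = [[2.0, 0.1], [2.1, 0.0], [1.9, -0.1], [2.0, -0.1], [2.1, 0.1], [1.9, 0.0],
     [0.1, 2.0], [0.0, 2.1], [-0.1, 1.9],
     [-6.0, 5.0], [4.0, 7.0]]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0]
w = [0.0, 0.0]                                  # untrained model

grads = per_sample_gradients(w, X, y)
labels = cluster_gradients(grads, eps=0.3, min_pts=3)
weights = balanced_weights(labels)
print(labels)
print(weights)
```

On this toy data the six majority examples and the three minority examples form two separate gradient clusters, and the last two examples are labelled `-1`. The reweighting step is only one plausible way to use such labels; the abstract does not say how the thesis combines the discovered subgroups with training.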
Date issued
2022-05
URI
https://hdl.handle.net/1721.1/145039
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
