Gradient Subgroup Scanning for Distributionally and Outlier Robust Models

Author(s)

Jung, Luann

Advisor

Solomon, Justin

Yurochkin, Mikhail

Greenewald, Kristjan

Terms of use

In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Traditional machine learning methods such as empirical risk minimization (ERM) frequently achieve high average accuracy but low accuracy on certain subgroups, especially when spurious correlations exist between the input data and the label. Previous approaches to reducing the gap between average and worst-group accuracy typically require expensive subgroup annotations, either for every training data point (as in group distributionally robust optimization (DRO)) or for every validation data point. Furthermore, these distributionally robust approaches tend to perform worse when outliers are also present in the data. Unfortunately, existing methods for improving subgroup performance cannot simply be combined with prior approaches for excluding outliers, because the two often directly conflict. This work proposes a method for training machine learning models that addresses both group robustness and outlier exclusion and requires no prior knowledge of the subgroups or outliers in the data. We aim to balance these two traditionally conflicting goals by clustering the gradients of the losses with respect to the model parameters. In doing so, we find minority subgroups and exclude outliers under the assumption that gradients within a group behave similarly, whereas gradients of outliers behave more randomly. On the Waterbirds image dataset, the method improves both average and worst-group accuracy over baselines and other previous methods.
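The gradient-clustering idea in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm, only a minimal illustration under stated assumptions: a linear logistic model stands in for the network, a simplified DBSCAN-style density rule stands in for whatever clustering procedure the thesis uses, and the names `per_example_gradients`, `cluster_gradients`, `eps`, and `min_pts`, as well as the toy data, are all hypothetical.

```python
import math
from collections import Counter


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def per_example_gradients(w, xs, ys):
    """Gradient of the logistic loss w.r.t. w, one vector per example."""
    grads = []
    for x, y in zip(xs, ys):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        grads.append([(p - y) * xi for xi in x])
    return grads


def cluster_gradients(grads, eps, min_pts):
    """Simplified DBSCAN-style clustering (an assumption, not the thesis's rule).

    Returns one label per gradient; -1 marks a point treated as an outlier.
    """
    n = len(grads)
    neighbors = [
        [j for j in range(n) if math.dist(grads[i], grads[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # too few nearby gradients: provisional outlier
            continue
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                frontier.extend(neighbors[j])  # dense point: keep expanding
        cluster += 1
    return labels


# Hypothetical toy data: a majority group, a minority group, three stray points.
majority = [([1.0 + 0.01 * k, 0.0], 1) for k in range(20)]
minority_pts = [([0.0, 1.0 + 0.01 * k], 1) for k in range(5)]
strays = [([4.0, -3.0], 0), ([-5.0, 2.0], 1), ([3.0, 6.0], 1)]
data = majority + minority_pts + strays
xs = [x for x, _ in data]
ys = [y for _, y in data]

w = [0.0, 0.0]  # model parameters at initialization
grads = per_example_gradients(w, xs, ys)
labels = cluster_gradients(grads, eps=0.1, min_pts=3)

# The smallest non-outlier cluster is a candidate minority subgroup.
sizes = Counter(label for label in labels if label != -1)
minority = min(sizes, key=sizes.get)
outliers = [i for i, label in enumerate(labels) if label == -1]
print("cluster sizes:", dict(sizes))
print("candidate minority cluster:", minority)
print("outlier indices:", outliers)
```

In this sketch, examples from the same group produce nearby gradients and fall into one cluster, while the stray points have no close neighbors in gradient space and are labeled as outliers. A training loop could then upweight the smallest cluster and drop the outliers, though how the thesis uses the clusters is not specified in the abstract.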

Date issued

2022-05

URI

https://hdl.handle.net/1721.1/145039

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses