DSpace@MIT


Last Layer Retraining of Selectively Sampled Wild Data Improves Performance

Author(s)
Yang, Hao Bang
Download: Thesis PDF (1.360Mb)
Advisor
Solomon, Justin
Yurochkin, Mikhail
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
AI models perform well in lab settings, where training and test data come from similar domains, but their performance drops significantly in the wild, where data can lie in domains outside the training distribution. Out-of-distribution (OOD) generalization is difficult because these domains are underrepresented in, or absent from, the training data. Efforts to close the performance gap between in-distribution and OOD data have produced a variety of generalization algorithms that aim to find invariant, or "good", features. Recent results suggest that a poorly generalizing classification layer may be the main contributor to this gap, while the featurizer already produces sufficiently good features. This thesis tests that hypothesis across a combination of datasets, generalization algorithms, and training methods for the classifier. We show that simply retraining a new classification layer on a small number of labeled examples significantly improves OOD performance over the original models when evaluated on natural OOD domains. We further study methods for efficiently selecting which OOD examples to label for classifier training, by applying clustering techniques to featurized unlabeled OOD data.
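The two ideas in the abstract, cluster-based selection of examples to label and retraining only the classification layer on frozen features, can be illustrated with a minimal sketch. This is not the thesis's implementation. The abstract does not say which clustering technique or classifier training method is used, so k-means, nearest-to-centroid selection, and a softmax classifier fitted by gradient descent are assumptions here. The function names (`kmeans_centroids`, `select_for_labeling`, `retrain_last_layer`) are hypothetical, and synthetic Gaussian vectors stand in for the output of a real featurizer.

```python
import numpy as np

def kmeans_centroids(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice); returns k centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every centroid, shape (n, k).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:  # leave a centroid in place if its cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def select_for_labeling(features, budget, seed=0):
    """Pick the unlabeled point nearest each centroid as a candidate for labeling.

    May return fewer than `budget` indices if two centroids share a nearest point.
    """
    centroids = kmeans_centroids(features, budget, seed=seed)
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return np.unique(dists.argmin(axis=0))

def retrain_last_layer(features, labels, n_classes, lr=0.5, n_steps=500, l2=1e-3):
    """Fit a fresh linear softmax classifier on frozen features by gradient descent."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(n_steps):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (features.T @ grad + l2 * W)
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(features, W, b):
    return (features @ W + b).argmax(axis=1)

# Synthetic stand-in for featurized OOD data: two well-separated Gaussian classes.
rng = np.random.default_rng(0)
dim, per_class = 8, 200
ood_features = np.vstack([
    rng.normal(-3.0, 1.0, size=(per_class, dim)),
    rng.normal(+3.0, 1.0, size=(per_class, dim)),
])
ood_labels = np.array([0] * per_class + [1] * per_class)  # hidden until "labeled"

chosen = select_for_labeling(ood_features, budget=10)      # indices to send for labeling
W, b = retrain_last_layer(ood_features[chosen], ood_labels[chosen], n_classes=2)
acc = (predict(ood_features, W, b) == ood_labels).mean()   # accuracy on all OOD points
```

In a real pipeline the feature matrix would come from the frozen featurizer of a trained model, and only the labels of the selected indices would need to be collected. The labeling budget and the optimizer settings above are illustrative values, not figures from the thesis.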
Date issued
2023-06
URI
https://hdl.handle.net/1721.1/151358
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.