DSpace@MIT

When Should Model Updates Propagate?

Author(s)
Struckman, Isabella Marguerite
Advisor
Mądry, Aleksander
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
AI supply chains increasingly rely on downstream developers adapting pretrained upstream models. When upstream models are retrained with data deletions (which may be prompted by copyright violations, privacy compliance, or removal of illicit content), it is unclear whether all downstream developers must also undergo costly retraining. In this thesis, we investigate the propagation of data deletions through fine-tuned models within a controlled visual classification setting comprising dog-breed and plane-manufacturer recognition tasks. We show that not all model updates propagate equivalently to downstream tasks, and that a deletion’s effect on the downstream model depends strongly on how the deleted data relates to the downstream task. We demonstrate that neither simple performance metrics (accuracy or F1), nor output-level divergences, nor even embedding-based similarity metrics alone adequately predict when a deletion meaningfully impacts downstream tasks. To overcome these limitations, we introduce an information-theoretic metric grounded in Gaussian mixture modeling (GMM) of embedding distributions, capturing deeper representational shifts. Our proposed approach precisely distinguishes when deletions require downstream retraining, achieving high predictive accuracy and recall without directly accessing retrained downstream models.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162918
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses