dc.contributor.advisor: Guttag, John V.
dc.contributor.author: Shanmugam, Divya
dc.date.accessioned: 2024-09-03T21:07:01Z
dc.date.available: 2024-09-03T21:07:01Z
dc.date.issued: 2024-05
dc.date.submitted: 2024-07-10T13:02:06.415Z
dc.identifier.uri: https://hdl.handle.net/1721.1/156553
dc.description.abstract: The data we have are often not the data we wish to use. This distinction can have serious consequences for the behavior of machine learning models across environments and demographic subgroups. If a disease is systematically underdiagnosed, machine learning models trained on this data risk replicating patterns of underdiagnosis. If the data used to evaluate machine learning models is not representative of data the models encounter during deployment, we risk missing model failures on subsets of the data distribution. If the demographics we use to assess the fairness of machine learning models are excessively coarse, we risk missing significant disparities in algorithmic performance. For domains in which flawed data is common, these systematic differences represent a barrier to the widespread adoption of machine learning systems. In this thesis, we develop methods to encourage machine learning predictions to be reliable and equitable even when the underlying data are not. We approach this goal in three ways. We do so first by taking a data-centric lens, and developing methods to precisely characterize differences between the data we have and the data we wish to have (Chapters 2 & 3). We then adopt a model-centric lens to consider how one might efficiently update models without access to the training data (Chapters 4 & 5). Finally, we provide commentary on standard approaches to the use of race when evaluating machine learning systems (Chapter 6). In sum, this dissertation is a step towards machine learning methodology that is robust to the inevitably unreliable and inequitable data we are able to observe.
dc.publisher: Massachusetts Institute of Technology
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Advancing Equity and Reliability in Machine Learning
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy