DSpace@MIT (MIT Libraries)

Neural Data Shaping and Evaluation via Mutual Information Estimation

Author(s)
Wu, William
Advisor
Médard, Muriel
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Machine learning in sensitive domains such as healthcare faces a major bottleneck: publicly available data is scarce. Privacy regulations such as HIPAA and GDPR, together with recent progress in the information estimation literature, motivate us to investigate the problem from an information-theoretic perspective. In this thesis, we propose InfoShape, an encoder training scheme that aims to maintain privacy while preserving utility for downstream prediction tasks. We use mutual information neural estimation (MINE) [2] to estimate two quantities: privacy leakage, the mutual information between the original inputs and the encoded representations, and utility score, the mutual information between the encoded representations and the labels intended for classification. We train a neural network encoder using these privacy and utility measures in a Lagrangian optimization. We show empirically on Gaussian-generated data that InfoShape alters the encoded outputs so that privacy leakage decreases and the utility score increases. Moreover, we observe that the classification accuracy of downstream models is meaningfully connected to the utility score, which improves once the encoder is trained, relative to the untrained encoder. This work has implications for privacy-preserving machine learning and could serve as a tool for applying AI in areas such as healthcare.
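The two ingredients named in the abstract can be illustrated with a minimal sketch. MINE is built on the Donsker-Varadhan lower bound on mutual information, I(X;Z) >= E_joint[T] - log E_product[exp(T)], where T is a critic function; in MINE the critic is a neural network trained to maximize the bound. The sketch below evaluates the bound for a fixed, hand-written critic instead of a trained network, and combines two such estimates in a Lagrangian-style loss. The names `dv_lower_bound`, `encoder_objective`, and `lam`, the toy data, and the sign convention of the loss are all assumptions made for illustration; the abstract does not state the exact objective used in the thesis.

```python
import math
from typing import Callable, Sequence, Tuple

Pair = Tuple[float, float]


def dv_lower_bound(
    critic: Callable[[float, float], float],
    joint_pairs: Sequence[Pair],
    product_pairs: Sequence[Pair],
) -> float:
    """Donsker-Varadhan bound: E_joint[T] - log E_product[exp(T)].

    joint_pairs are samples drawn together (from the joint distribution);
    product_pairs are samples paired independently (product of marginals).
    """
    joint_term = sum(critic(a, b) for a, b in joint_pairs) / len(joint_pairs)
    exp_mean = sum(math.exp(critic(a, b)) for a, b in product_pairs) / len(product_pairs)
    return joint_term - math.log(exp_mean)


def encoder_objective(leakage: float, utility: float, lam: float) -> float:
    """Hypothetical Lagrangian-style loss to minimize:
    penalize privacy leakage, reward utility weighted by lam.
    (Assumed form; not taken from the thesis.)
    """
    return leakage - lam * utility


# Toy data: two perfectly dependent binary variables, so the true MI is log(2).
joint = [(0.0, 0.0), (1.0, 1.0)]
product = [(a, b) for a in (0.0, 1.0) for b in (0.0, 1.0)]


def fixed_critic(a: float, b: float) -> float:
    # Hand-written stand-in for a trained network: scores matching pairs higher.
    return 2.0 if a == b else 0.0


estimate = dv_lower_bound(fixed_critic, joint, product)
print(f"DV lower bound: {estimate:.4f}, true MI: {math.log(2):.4f}")
```

With a trainable critic, the bound would be maximized by gradient ascent to tighten the estimate, and the encoder would then be updated to reduce the combined loss. Larger `lam` favors utility over privacy in this assumed formulation.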
Date issued
2022-09
URI
https://hdl.handle.net/1721.1/147511
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
