Privacy preserving framework for federated learning in genomics
Author(s)
Kokje, Yashashree.
Download: 1262994244-MIT.pdf (1.428Mb)
Other Contributors
Massachusetts Institute of Technology. Engineering and Management Program.
System Design and Management Program.
Abstract
With the advent of machine learning, organizations today collect and process data at an unprecedented scale. This has led to rapid growth in innovation across industries, but it also poses numerous challenges around maintaining user privacy, particularly in healthcare and genomics, where data is highly sensitive. Unlike credit cards or passwords, one's genomic information cannot be modified at will and can uniquely identify the individual. The objective of this thesis is to develop an easily configurable framework that allows organizations to collaborate and advance genomic research without directly sharing user data with each other. This thesis includes the development of a privacy-preserving framework for federated learning on genomic datasets that are distributed across organizational silos. PAGe (Privacy Aware Genomics) has been open-sourced and has a low barrier to entry. A packaged runtime environment is available that includes popular bioinformatics tools and machine learning libraries. Experimental setup is controlled through configuration files, allowing users to easily terminate, restart, or reproduce results. Finally, there is an in-depth evaluation of the framework using Type 2 Diabetes disease risk prediction as a case study, with the 1000 Genomes dataset as input.
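As a rough illustration of the general idea of federated learning across silos, the following minimal sketch shows one round of federated averaging for a toy logistic regression model. It is not taken from PAGe and does not claim to reflect how the thesis implements training or privacy protection; the function names (`local_update`, `federated_average`, `federated_round`), the learning rate, and the toy numeric features standing in for encoded genotypes are all assumptions made for this example. The point it shows is that each silo trains on its own records and only model parameters are sent for aggregation.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
# Each record is (features, label), with label in {0, 1}. Hypothetical toy format.
Dataset = Sequence[Tuple[Sequence[float], int]]


def local_update(weights: Sequence[float], data: Dataset, lr: float = 0.1) -> Vector:
    """One pass of stochastic gradient descent for logistic regression,
    run entirely inside a single silo on its own data."""
    w = list(weights)
    for x, y in data:
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
        for i, xi in enumerate(x):
            w[i] -= lr * (p - y) * xi   # gradient of the log loss
    return w


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> Vector:
    """Average the silos' parameter vectors, weighted by local dataset size."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of records must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all parameter vectors must have the same length")
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def federated_round(global_weights: Sequence[float],
                    silos: Sequence[Dataset], lr: float = 0.1) -> Vector:
    """One round: every silo trains locally, then only weights are aggregated."""
    updates = [local_update(global_weights, data, lr) for data in silos]
    sizes = [len(data) for data in silos]
    return federated_average(updates, sizes)


if __name__ == "__main__":
    silo_a = [([1.0, 0.0], 1)]  # toy record held by organization A
    silo_b = [([0.0, 1.0], 0)]  # toy record held by organization B
    print(federated_round([0.0, 0.0], [silo_a, silo_b]))
```

Sharing parameters instead of records does not by itself guarantee privacy, since model updates can still leak information; a real deployment would combine a scheme like this with additional safeguards.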
Description
Thesis: S.M. in Engineering and Management, Massachusetts Institute of Technology, System Design and Management Program, May, 2020. Cataloged from the official version of thesis. Includes bibliographical references (pages 57-59).
Date issued
2020
Department
Massachusetts Institute of Technology. Engineering and Management Program
Publisher
Massachusetts Institute of Technology
Keywords
Engineering and Management Program., System Design and Management Program.