Patient clustering using electronic medical records
Author(s)
Shea, Andrew (Andrew L.)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Manolis Kellis.
Abstract
Electronic health records (EHRs) and their wealth of patient health information present new opportunities for understanding relationships between patients and their conditions. However, the sparsity, quality, and accessibility of EHR data pose computational challenges. To address these challenges, we apply spectral clustering and variational autoencoders to obtain compact patient representations and clusters from EHRs in an unsupervised manner. We apply these methods to the MIMIC dataset, from which we use only ICD-9 diagnostic codes to ensure data accessibility. After obtaining clusters, we conduct a high-resolution analysis by examining the 5 most frequent phenotypes within each cluster. We then conduct a low-resolution analysis by examining the distribution of phenotypes within each cluster, constructing a cluster network to examine the relationships among the most prevalent phenotypes in each cluster, and comparing our findings to existing medical literature. While preliminary, these results suggest that sparse EHR data are sufficient for uncovering associations between conditions and diseases.
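As a rough illustration of the spectral clustering step described in the abstract, the sketch below clusters a toy binary patient-by-code matrix and lists the most frequent code in a cluster. It is not the thesis implementation: the cosine similarity graph, the simple k-means step, the function names (spectral_embed, kmeans, top_codes), and the toy data are all assumptions made for illustration, and the variational autoencoder is not shown.

```python
import numpy as np

def spectral_embed(X, k):
    """Embed patients using the k smallest eigenvectors of the
    symmetric normalized graph Laplacian (assumed cosine similarity graph)."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                  # guard against patients with no codes
    Xn = X / norms
    W = Xn @ Xn.T                            # cosine similarity between patients
    np.fill_diagonal(W, 0.0)                 # no self-loops
    deg = W.sum(axis=1)
    deg[deg == 0] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)            # eigenvalues in ascending order
    U = vecs[:, :k]
    row_norms = np.linalg.norm(U, axis=1, keepdims=True)
    row_norms[row_norms == 0] = 1.0
    return U / row_norms                     # row-normalized embedding

def kmeans(U, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        new = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def top_codes(X, labels, cluster, n=5):
    """Column indices of the n most frequent codes within one cluster."""
    counts = np.asarray(X)[labels == cluster].sum(axis=0)
    return np.argsort(-counts)[:n]

# Toy data (hypothetical): rows are patients, columns are placeholder
# diagnosis-code indicators, 1 if the code appears in the patient's record.
X = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
])
labels = kmeans(spectral_embed(X, k=2), k=2)
print(labels)
print(top_codes(X, labels, labels[0], n=1))
```

On this toy matrix the two groups of patients share no codes, so the similarity graph has two disconnected components and the embedding separates them cleanly; real EHR data would be far sparser and noisier, and the choice of similarity measure and number of clusters would need to be justified.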
Description
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 61-66).
Date issued
2020
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.