Applying domain knowledge to clinical predictive models

Liu, Yun, Ph. D. Massachusetts Institute of Technology

dc.contributor.advisor	Collin M. Stultz and John V. Guttag.	en_US
dc.contributor.author	Liu, Yun, Ph. D. Massachusetts Institute of Technology	en_US
dc.contributor.other	Harvard--MIT Program in Health Sciences and Technology.	en_US
dc.date.accessioned	2016-09-30T18:25:08Z
dc.date.available	2016-09-30T18:25:08Z
dc.date.copyright	2016	en_US
dc.date.issued	2016	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/104469
dc.description	Thesis: Ph. D. in Medical Engineering, Harvard-MIT Program in Health Sciences and Technology, 2016.	en_US
dc.description	This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.	en_US
dc.description	Cataloged from student-submitted PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 115-124).	en_US
dc.description.abstract	Clinical predictive models are useful in predicting a patient's risk of developing adverse outcomes and in guiding patient therapy. In this thesis, we explored two different ways to apply domain knowledge to improve clinical predictive models. We first applied knowledge about the heart to engineer better frequency-domain features from electrocardiograms (ECG). The standard frequency domain (in Hz) quantifies events that repeat with respect to time. However, this may be misleading because patients have different heart rates. We hypothesized that quantifying frequency with respect to heartbeats may adjust for these heart rate differences. We applied this beat-frequency to improve two existing ECG predictive models, one based on ECG morphology, and the other based on instantaneous heart rate. We then used machine learning to find predictive frequency bands. When evaluated on thousands of patients after an acute coronary syndrome, our method significantly improved prediction performance (e.g., area under curve, AUC, from 0.70 to 0.75). In addition, the same bands were found to be predictive in different patients for beat-frequency, but not for the standard frequency domain. Next, we developed a method to transfer knowledge from published biomedical articles to improve predictive models when training data are scarce. We used this knowledge to estimate the relevance of features to a given outcome, and used these estimates to improve feature selection. We applied our method to predict the onset of several cardiovascular diseases, using training data that contained only 50 adverse outcomes. Relative to a standard approach (which does not transfer knowledge), our method significantly improved the AUC from 0.66 to 0.70. In addition, our method selected 60% fewer features, improving interpretability of the model by experts, which is a key requirement for models to see real-world use.	en_US
dc.description.statementofresponsibility	by Yun Liu.	en_US
dc.format.extent	124 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Harvard--MIT Program in Health Sciences and Technology.	en_US
dc.title	Applying domain knowledge to clinical predictive models	en_US
dc.type	Thesis	en_US
dc.description.degree	Ph. D. in Medical Engineering	en_US
dc.contributor.department	Harvard University--MIT Division of Health Sciences and Technology
dc.identifier.oclc	958999325	en_US

Files in this item

Name: 958999325-MIT.pdf
Size: 3.312Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Doctoral Theses
