Learning to prevent healthcare-associated infections : leveraging data across time and space to improve local predictions
Author(s)
Wiens, Jenna Marleau
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
John Guttag.
Abstract
The proliferation of electronic medical records holds out the promise of using machine learning and data mining to build models that will help healthcare providers improve patient outcomes. However, building useful models from these datasets presents many technical problems. Among the challenges are the large number of factors (both intrinsic and extrinsic) influencing a patient's risk of an adverse outcome, the inherent evolution of that risk over time, the relative rarity of adverse outcomes, institutional differences, and the lack of ground truth. In this thesis we tackle these challenges in the context of predicting healthcare-associated infections (HAIs). HAIs are a serious problem in US acute care hospitals, affecting approximately 4% of all inpatients on any given day. Despite best efforts to reduce incidence, HAIs remain stubbornly prevalent. We hypothesize that one reason is the lack of an effective clinical tool for accurately measuring patient risk. Therefore, we develop accurate models for predicting which patients are at risk of acquiring an infection with Clostridium difficile (a common HAI). In contrast to previous work, we take a novel data-centric approach, leveraging the contents of EMRs from over 100,000 hospital admissions. We show how, by adapting techniques from time-series classification, transfer learning and multitask learning, we can learn more accurate models for patient risk stratification. Our model, based on thousands of variables, both time-varying and time-invariant, does not remain static but changes over the course of a patient admission. Applied to a held-out validation set of 25,000 patient admissions, our model achieved an area under the receiver operating characteristic curve of 0.81 (95% CI 0.78-0.84). The model has been successfully integrated into the health record system at a large hospital in the US, and is being used to produce daily risk estimates for each inpatient.
While more complex than traditional risk stratification methods, the widespread development and use of such data-driven models could ultimately enable cost-effective, targeted prevention strategies that reduce the incidence of HAIs.
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. 148 Cataloged from PDF version of thesis. Includes bibliographical references (pages 136-149).
Date issued
2014
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.