Learning to prevent healthcare-associated infections : leveraging data across time and space to improve local predictions

Wiens, Jenna Marleau

Author(s)

Wiens, Jenna Marleau

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

John Guttag.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

The proliferation of electronic medical records (EMRs) holds out the promise of using machine learning and data mining to build models that will help healthcare providers improve patient outcomes. However, building useful models from these datasets presents many technical problems. Among the challenges are the large number of factors (both intrinsic and extrinsic) influencing a patient's risk of an adverse outcome, the inherent evolution of that risk over time, the relative rarity of adverse outcomes, institutional differences, and the lack of ground truth. In this thesis we tackle these challenges in the context of predicting healthcare-associated infections (HAIs). HAIs are a serious problem in US acute care hospitals, affecting approximately 4% of all inpatients on any given day. Despite best efforts to reduce incidence, HAIs remain stubbornly prevalent. We hypothesize that one reason is the lack of an effective clinical tool for accurately measuring patient risk. Therefore, we develop accurate models for predicting which patients are at risk of acquiring an infection with Clostridium difficile (a common HAI). In contrast to previous work, we take a novel data-centric approach, leveraging the contents of EMRs from over 100,000 hospital admissions. We show how, by adapting techniques from time-series classification, transfer learning and multitask learning, we can learn more accurate models for patient risk stratification. Our model, based on thousands of variables both time-varying and time-invariant, does not remain static but changes over the course of a patient admission. Applied to a held-out validation set of 25,000 patient admissions, our model achieved an area under the receiver operating characteristic curve of 0.81 (95% CI 0.78-0.84). The model has been successfully integrated into the health record system at a large hospital in the US, and is being used to produce daily risk estimates for each inpatient.
While more complex than traditional risk stratification methods, the widespread development and use of such data-driven models could ultimately enable cost-effective, targeted prevention strategies that reduce the incidence of HAIs.

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.

148 pages

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 136-149).

Date issued

2014

URI

http://hdl.handle.net/1721.1/91103

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses