Machine Learning for Sepsis Prognosis: Prediction Models and Dissecting Electronic Health Records

Liao, Wei

Author(s)

Liao, Wei

Advisor

Voldman, Joel

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Sepsis, the body's extreme response to an infection, is a life-threatening medical emergency. Given the heavy burden sepsis places on the health care system, extensive research has been performed to facilitate sepsis diagnosis. Sepsis prognosis, which assesses the likely progression of the disease and thus informs treatment decisions, is much less explored. Here I present two approaches to building sepsis prognosis models. First, I introduced the idea of assessing neutrophil function from simple-to-obtain phase microscopy images. I developed an experimental pipeline that uses measurement of reactive oxygen species generation as a label of neutrophil function. I generated a large neutrophil imaging dataset and explored different deep learning approaches to predict neutrophil activation state. Second, I developed machine learning models to predict a sepsis patient's future clinical score from electronic health records. As part of this effort, I developed a multi-database extraction pipeline to facilitate the extraction of electronic health records. My work demonstrates the potential of deep learning models to evaluate functional aspects of the immune system and to predict the future state of sepsis patients, which could provide significant insight into sepsis prognostic monitoring and is easy to adapt to clinical settings.

Understanding the input data is of great significance in developing reliable and generalizable machine learning models for healthcare. It is also increasingly apparent that such models can predict sensitive patient information from data that does not explicitly encode it. However, we lack a clear understanding of the extent of the problem: what types of sensitive information can be predicted, and how this generalizes across models and datasets. We lack approaches to develop models that can make clinical inferences without inferring sensitive information. Critically, we also lack approaches to explain such data encoding. Using electronic health records, I thoroughly investigated the ability of machine learning models to encode a wide range of sensitive patient information. I developed a strategy to ensure that clinical prediction is minimally based on sensitive patient information. I presented an approach that can explain feature importance in the encoding of sensitive patient information. This set of studies not only provides a deep understanding of the sepsis patient clinical score prediction model but is also applicable to a variety of machine learning models that use time-series data.

Date issued

2024-05

URI

https://hdl.handle.net/1721.1/156618

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Doctoral Theses