Interpretable machine learning methods for stroke prediction

Author(s)

Zhang, Rebecca

Download

1138021852-MIT.pdf (1.098 MB)

Other Contributors

Massachusetts Institute of Technology. Operations Research Center.

Advisor

Dimitris Bertsimas.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Machine learning has long been touted as the next big tool, revolutionizing scientific endeavors as well as impacting industries like retail and finance. Naturally, there is much interest in its potential to improve healthcare next. However, applying traditional machine learning approaches in this domain presents many difficulties, chief among which is the issue of interpretability. We focus on the medical condition of stroke, a particularly desirable problem to target because it is one of the most prevalent and yet preventable conditions affecting Americans today. In this thesis, we apply novel interpretable prediction techniques to the problem of predicting stroke presence, location, acuity, and mortality risk for patient populations at two different hospital systems. We show that an interpretable, optimal tree-based approach is roughly as effective as, if not better than, black-box approaches. Using the clinical learnings from these studies, we explore new interpretable methodologies designed with medical applications and their unique challenges in mind. We present a novel regression algorithm to predict outcomes when the population is composed of notably different subpopulations, and demonstrate that this gives comparable performance with improved interpretability. Finally, we explore new natural language processing techniques for machine learning from text. We propose an alternative end-to-end framework for going from unprocessed textual data to predictions, with an interpretable linguistics-based approach to modeling words. Altogether, this work demonstrates the promise that new parsimonious, interpretable algorithms hold in the domain of stroke and broader healthcare problems.

Description

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

Thesis: S.M., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019

Cataloged from student-submitted PDF version of thesis.

Includes bibliographical references (pages 70-75).

Date issued

2019

URI

https://hdl.handle.net/1721.1/123710

Department

Massachusetts Institute of Technology. Operations Research Center; Sloan School of Management

Publisher

Massachusetts Institute of Technology

Keywords

Operations Research Center.

Collections

Graduate Theses