| dc.description.abstract | Modern medicine is facing a fundamental shift with the increasing availability of large-scale electronic health data and artificial intelligence-based technologies. In particular, integration across different patient characteristics to optimize, learn, and plan simultaneously across multiple medical tasks of interest, or what we call system medicine, provides opportunities for clinical and operational systems to improve disease diagnosis, operational efficiency, and, most importantly, clinical understanding. This thesis aims to develop and validate novel methods using artificial intelligence and optimization to address challenges faced in this domain.
We present general-purpose artificial intelligence frameworks in Part 1. First, we introduce Holistic Artificial Intelligence in Medicine (HAIM), an integrated pipeline that combines multimodal data spanning tabular, time-series, vision, and language modalities into a single framework for downstream task learning. We then develop Multimodal Multitask Machine Learning for Healthcare (M3H), an end-to-end, many-to-many framework that joins the learning of multimodal data with a diverse set of medical and machine learning problem tasks. This work proposes a novel attention mechanism as well as a new explainability metric that extends previous work on evaluating the contributions of the input space (features) to the output space (outcomes). These works are actively being incorporated to improve diagnosis in cardiovascular and oncology studies using ECG and multi-omics data.
We then address real-world adoption concerns to design responsible machine learning models using optimization in Part 2. We first introduce robust regression under averaged uncertainty, which yields exact, closed-form, and analytical solutions that recover ridge regression. We show how the geometric properties of the uncertainty set are closely linked to the regularization strength of the equivalent ridge regression. We then propose an adaptive, data-driven approach for personalized breast cancer screening scheduling, which integrates an ML-based survival prediction model and a stochastic optimization formulation that balances screening delay and screening frequency.
Finally, we apply predictive and prescriptive analytic methods to improve general medical outcomes in Part 3 and Part 4, respectively. These studies span oncology, trauma, cardiovascular care, and logistics planning. In Part 3, we aim to develop models that learn the outcome as accurately as possible. We show that predictive methods across different machine learning methodologies, including deep neural networks for computer vision tasks and tree-based models (including Optimal Classification Trees and gradient boosted trees), can either significantly improve over existing benchmarks or achieve performance comparable to manual physician practice. In Part 4, we delve into prescriptive analytics, which focuses on assigning the optimal treatment or other clinical decision to achieve the best outcome. We apply the interpretable Optimal Policy Trees methodology across oncology and trauma settings and observe improved medical outcomes (i.e., mortality rate). | |