Show simple item record

dc.contributor.advisor: Szolovits, Peter
dc.contributor.author: Hsu, Tzu-Ming Harry
dc.date.accessioned: 2023-01-19T19:52:33Z
dc.date.available: 2023-01-19T19:52:33Z
dc.date.issued: 2022-09
dc.date.submitted: 2022-10-19T19:08:27.636Z
dc.identifier.uri: https://hdl.handle.net/1721.1/147469
dc.description.abstract: Data for modern medical imaging modeling are constrained by high physical density, complex structure, insufficient annotation, heterogeneity across sites, long-tailed distributions of findings/conditions/diseases, and sparsely presented information. In this dissertation, to utilize such constrained data effectively, we employ a variety of computationally driven and clinically driven techniques, including cross-modal learning, deep reinforcement learning, transfer learning, federated learning, surrogate endpoint modeling, and clinical knowledge infusion. These techniques are demonstrated in a range of applications: risk stratification for pancreatic cancer patients, COVID-19 severity risk assessment, cross-modal X-ray image and report retrieval, X-ray finding report generation from an image, orthopantomogram finding summarization, and real-world federated learning benchmarking. In the disease risk stratification applications, we develop an end-to-end body composition assessment system that quantifies fat and muscle amounts from 3-dimensional imaging studies with a two-step approach. The resulting body composition ratios for various tissues are then used to stratify risk in pancreatic cancer or COVID-19 patients. In the pancreatic cancer cohort, muscle loss is shown to be a good indicator of mortality risk, and in COVID-19 patients, visceral fat correlates more strongly with severity than body mass index does, despite the latter being the current go-to indicator. Following these clinical applications of body composition analysis, we take advantage of large-scale chest X-ray/report datasets to investigate how associating the textual and imaging modalities can assist modeling. We explore retrieval across radiographs and medical reports by learning a joint embedding space, and find that retrieval performance benefits from even a small amount of supervision.
On the task of medical report generation, we attempt to describe clinical findings in a chest X-ray as radiologists do. Whereas past work considered only language fluency and not clinical efficacy, we include both in our modeling process. The resulting models are, unsurprisingly, better at describing diseases and findings, which we identify as a key trait for an AI system that aims to augment clinicians in their workflows. We then turn to finding summarization from orthopantomograms, or panoramic dental X-rays. The goal of the summarization is to localize teeth in the permanent dentition and tag each with labels from six potential findings. To combine the modeling process with existing dental knowledge, we propose a new form of annotation that is quick to provide: a set of 32 binary labels indicating the presence of each tooth. This annotation drives a novel objective function for the system to optimize and, despite its simplicity relative to the pixel-wise supervision typically used for this task, is shown to improve finding summarization accuracy. Finally, we inspect federated learning, a learning paradigm in which medical institutions collaboratively learn an AI model without exposing private patient data. As a precursor to medical imaging, we gather two large natural-image classification datasets at real-world scale, aiming to characterize the impact of data heterogeneity on the performance of existing federated learning algorithms. Our results show that extreme data heterogeneity can greatly impair these algorithms' ability to classify visual patterns in federated learning setups, and the two novel solutions we propose partially alleviate the performance drop. We believe these conclusions extend to medical imaging problems.
To conclude the dissertation, we offer remarks on other important aspects that researchers in medical AI must consider before deploying their applications in clinics, as well as some exciting yet under-explored research directions in medical imaging. While the objective of this dissertation is to provide extensive coverage of methods that more effectively model medical imaging tasks when the available data are constrained, our explorations are not exhaustive. We hope the research topics showcased in this dissertation inspire further research and fuel explorations down the line, ultimately benefiting humanity at a civilizational scale.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Effective Modeling in Medical Imaging with Constrained Data
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.orcid: https://orcid.org/0000-0001-7198-7832
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy

