Multimodal Representation Learning for Medical Image Analysis

Liao, Ruizhi

dc.contributor.advisor	Golland, Polina
dc.contributor.author	Liao, Ruizhi
dc.date.accessioned	2022-02-07T15:13:01Z
dc.date.available	2022-02-07T15:13:01Z
dc.date.issued	2021-09
dc.date.submitted	2021-09-21T19:30:45.579Z
dc.identifier.uri	https://hdl.handle.net/1721.1/139925
dc.description.abstract	My thesis develops machine learning methods that exploit multimodal clinical data to improve medical image analysis. Medical images capture rich information about a patient’s physiological and disease status and are central to clinical practice and research. Computational models, such as artificial neural networks, enable automatic and quantitative medical image analysis, which may offer timely diagnosis in low-resource settings, advance precision medicine, and facilitate large-scale clinical research. Developing such image models demands large amounts of training data. Although digital medical images have become increasingly available, the scarcity of structured image labels for model training has remained a bottleneck. To overcome this challenge, I have built machine learning algorithms for medical image model development that exploit other clinical data. Clinical data is often multimodal, including images, text (e.g., radiology reports, clinical notes), and numerical signals (e.g., vital signs, laboratory measurements). These multimodal sources of information reflect different yet correlated manifestations of a subject’s underlying physiological processes. I propose machine learning methods that take advantage of the correlations between medical images and other clinical data to yield accurate computer vision models. I use mutual information to capture these correlations and develop novel algorithms for multimodal representation learning that leverage local data features. The experiments described in this thesis demonstrate the advantages of the multimodal learning approaches when applied to chest x-ray analysis.
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Multimodal Representation Learning for Medical Image Analysis
dc.type	Thesis
dc.description.degree	Ph.D.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.orcid	https://orcid.org/0000-0001-6761-921X
mit.thesis.degree	Doctoral
thesis.degree.name	Doctor of Philosophy

Files in this item

Name: Liao-ruizhi-PhD-EECS-2021-thes ...
Size: 26.45Mb
Format: PDF
Description: Thesis PDF

This item appears in the following Collection(s)

Doctoral Theses
