DSpace@MIT (MIT Libraries)


Multimodal Representation Learning for Medical Image Analysis

Author(s)
Liao, Ruizhi
Download: Thesis PDF (26.45 MB)
Advisor
Golland, Polina
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
My thesis develops machine learning methods that exploit multimodal clinical data to improve medical image analysis. Medical images capture rich information about a patient's physiological and disease status and are central to clinical practice and research. Computational models, such as artificial neural networks, enable automatic and quantitative medical image analysis, which may offer timely diagnosis in low-resource settings, advance precision medicine, and facilitate large-scale clinical research. Developing such image models demands large training datasets. Although digital medical images have become increasingly available, the limited availability of structured image labels for model training has remained a bottleneck. To overcome this challenge, I have built machine learning algorithms that develop medical image models by exploiting other clinical data. Clinical data are often multimodal, including images, text (e.g., radiology reports, clinical notes), and numerical signals (e.g., vital signs, laboratory measurements). These multimodal sources of information reflect different yet correlated manifestations of a subject's underlying physiological processes. I propose machine learning methods that exploit the correlations between medical images and other clinical data to yield accurate computer vision models. I use mutual information to capture these correlations and develop novel algorithms for multimodal representation learning that leverage local data features. The experiments described in this thesis demonstrate the advantages of these multimodal learning approaches in the application of chest x-ray analysis.
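The abstract describes using mutual information to couple image representations with other clinical modalities. A common way to operationalize this idea (not necessarily the exact method in the thesis) is an InfoNCE-style lower bound, which scores each image embedding against a batch of candidate text embeddings and rewards matching the paired one. The sketch below is purely illustrative: the function name, the toy data, and the temperature value are all assumptions, not drawn from the thesis.

```python
import numpy as np

def infonce_lower_bound(img_emb, txt_emb, temperature=0.1):
    """InfoNCE lower bound on the mutual information between paired
    image and text embeddings; row i of each array is one patient."""
    # Normalize embeddings so the dot product is a cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    # Similarity matrix: entry (i, j) scores image i against report j.
    logits = img @ txt.T / temperature
    n = logits.shape[0]
    # Log-softmax over each row; the matched pair sits on the diagonal.
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nce = log_softmax[np.arange(n), np.arange(n)].mean()
    # I(image; text) >= log N + E[log p(matched pair | candidates)].
    return np.log(n) + nce

rng = np.random.default_rng(0)
z = rng.normal(size=(64, 16))
# Correlated pair: "text" embedding is a noisy copy of the image embedding.
bound_corr = infonce_lower_bound(z, z + 0.1 * rng.normal(size=z.shape))
# Independent pair: unrelated embeddings share no information.
bound_indep = infonce_lower_bound(z, rng.normal(size=(64, 16)))
```

With correlated modalities the bound approaches its maximum of log N; with independent modalities it stays near zero, which is why maximizing it pulls paired image and text embeddings together.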
Date issued
2021-09
URI
https://hdl.handle.net/1721.1/139925
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.