| dc.contributor.advisor | Liang, Paul | |
| dc.contributor.author | Chen, Peilin | |
| dc.date.accessioned | 2025-09-18T14:28:56Z | |
| dc.date.available | 2025-09-18T14:28:56Z | |
| dc.date.issued | 2025-05 | |
| dc.date.submitted | 2025-06-23T14:01:22.587Z | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/162719 | |
| dc.description.abstract | We present a multimodal clinical AI framework that integrates time series, images, and text to support robust diagnostic reasoning across diverse input combinations. We first introduce ECG-JEPA, a self-supervised encoder pretrained on multiple ECG datasets to learn generalizable time series representations. This unimodal pretraining improves ECG classification, achieving a 23-point AUC gain on the underrepresented Ga dataset. We then align and fuse these ECG embeddings with chest X-rays and EHR text using a vision–language model backbone, enabling end-to-end multimodal inference. Our results show that incorporating ECG signals meaningfully improves diagnostic performance, highlighting the value of multitask time series pretraining and modular fusion for clinical AI. | |
| dc.publisher | Massachusetts Institute of Technology | |
| dc.rights | In Copyright - Educational Use Permitted | |
| dc.rights | Copyright retained by author(s) | |
| dc.rights.uri | https://rightsstatements.org/page/InC-EDU/1.0/ | |
| dc.title | Self-Supervised ECG Learning for Multimodal Clinical Tasks | |
| dc.type | Thesis | |
| dc.description.degree | M.Eng. | |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
| mit.thesis.degree | Master | |
| thesis.degree.name | Master of Engineering in Electrical Engineering and Computer Science | |