Building and Evaluating Cancer Prescreening Models with Electronic Health Records
Author(s)
Saowakon, Pasapol
Advisor
Rinard, Martin C.
Abstract
Cancer is a leading cause of death, killing over ten million people every year, and delayed treatment is often to blame. Building on a recent framework, we used electronic health records from TriNetX to develop prescreening models for ten cancer types: biliary tract, brain, breast (female), colon, esophageal, gastric, kidney, liver, lung, and ovarian. The models performed well, with neural network models consistently but marginally outperforming their logistic regression counterparts. As expected, models trained to detect a specific cancer type performed noticeably better than those trained more generally to detect any cancer. All models proved reasonably robust in geographical, racial, and temporal external validations, although a prospective study is still needed to verify their performance and potential impact.
Date issued
2023-06
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology