Multi-Dimensional Evaluation Metrics for Chest X-Ray Reports
Author(s)
Rawat, Saumya
Advisor
Szolovits, Peter
Abstract
In the past few years, there has been abundant research on using machine learning to generate high-quality radiology reports from the large MIMIC-CXR chest x-ray dataset. However, little work has focused on evaluating the quality of generated reports from a clinical perspective, where accuracy is the most important factor. Current evaluation metrics score reports along a single dimension. This work proposes the use of multiple dimensions (factual correctness, comprehensiveness, style, and overall quality) to better capture the evaluation preferences for a clinical text-generating model, where preferences can differ based on the use case. This work also presents a dataset of radiologist rating annotations for generated and reference chest x-ray radiology reports. Lastly, it creates an improved metric for the readability dimension by adding context awareness of frequent and acceptable medical terminology.
Date issued
2022-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology