
Evaluating the Spatial Reasoning Capabilities of Large Multimodal Models on Chest X-Ray Anomaly Detection

Author(s)
Li, Linday Skylar
Download: Skylar-Li-MIT-AIEDU-Final-Paper.pdf (2.044 MB)
Abstract
While current results show the potential of diagnosis based on large multimodal models (LMMs), it is unclear whether the models' outputs are backed by strong spatial reasoning capabilities. To evaluate this, I provided GPT-4o with chest X-rays from the NIH chest X-ray dataset and asked it to return diagnoses along with the coordinates of bounding boxes surrounding any abnormalities it identified. I find variable performance across different images in the dataset, suggesting the need for further development of the spatial reasoning capabilities of LMMs.
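
The abstract does not say how the returned bounding boxes were scored against reference annotations. One common choice for this kind of comparison is intersection-over-union (IoU), sketched below as an illustration only. The `Box` type, the `iou` function, the (x, y, width, height) pixel layout, and the sample coordinates are all assumptions made for the example, not details taken from the paper.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box in pixels: top-left corner plus width and height (assumed layout)."""
    x: float
    y: float
    w: float
    h: float


def iou(a: Box, b: Box) -> float:
    """Return the intersection-over-union of two boxes, a value in [0, 1]."""
    # Overlap extent along each axis, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    inter_h = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = inter_w * inter_h
    union = a.w * a.h + b.w * b.h - inter
    # Degenerate (zero-area) inputs yield 0.0 rather than a division error.
    return inter / union if union > 0 else 0.0


# Hypothetical coordinates for illustration; not drawn from the paper or the dataset.
predicted = Box(x=100, y=100, w=200, h=200)
reference = Box(x=200, y=200, w=200, h=200)
score = iou(predicted, reference)
```

An IoU of 1.0 means the predicted and reference boxes coincide, and 0.0 means they do not overlap at all. A fixed threshold on this score is often used to decide whether a localization counts as correct, though whether the paper does so is not stated here.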
Date issued
2025-07
URI
https://hdl.handle.net/1721.1/163147
Journal
2025 MIT AI and Education Summit

Collections
  • MIT RAISE Publications

Content created by the MIT Libraries, CC BY-NC unless otherwise noted.