Show simple item record

dc.contributor.author: Gourabathina, Abinitha
dc.contributor.author: Gerych, Walter
dc.contributor.author: Pan, Eileen
dc.contributor.author: Ghassemi, Marzyeh
dc.date.accessioned: 2025-12-22T20:55:09Z
dc.date.available: 2025-12-22T20:55:09Z
dc.date.issued: 2025-06-23
dc.identifier.isbn: 979-8-4007-1482-5
dc.identifier.uri: https://hdl.handle.net/1721.1/164428
dc.description: FAccT ’25, Athens, Greece [en_US]
dc.description.abstract: The integration of large language models (LLMs) into clinical diagnostics necessitates a careful understanding of how clinically irrelevant aspects of user inputs directly influence generated treatment recommendations and, consequently, clinical outcomes for end-users. Building on prior research that examines the impact of demographic attributes on clinical LLM reasoning, this study explores how non-clinically relevant attributes shape clinical decision-making by LLMs. Through the perturbation of patient messages, we evaluate whether LLM behavior remains consistent, accurate, and unbiased when non-clinical information is altered. These perturbations assess the brittleness of clinical LLM reasoning by replicating structural errors that may occur during the electronic processing of patient questions and by simulating patient-AI interactions across diverse, vulnerable patient groups. Our findings reveal notable inconsistencies in LLM treatment recommendations and significant degradation of clinical accuracy in ways that reduce care allocation to patients. Additionally, there are significant disparities in treatment recommendations between gender subgroups as well as between model-inferred gender subgroups. We also apply our perturbation framework to a conversational clinical dataset and find that, even in conversation, LLM clinical accuracy decreases post-perturbation, and disparities exist in how perturbations impact gender subgroups. By analyzing LLM outputs in response to realistic yet modified clinical contexts, our work deepens understanding of the sensitivity, inaccuracy, and biases inherent in medical LLMs, offering critical insights for the deployment of patient-AI systems. [en_US]
dc.publisher: ACM|The 2025 ACM Conference on Fairness, Accountability, and Transparency [en_US]
dc.relation.isversionof: https://doi.org/10.1145/3715275.3732121 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: The Medium is the Message: How Non-Clinical Information Shapes Clinical Decisions in LLMs [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Abinitha Gourabathina, Walter Gerych, Eileen Pan, and Marzyeh Ghassemi. 2025. The Medium is the Message: How Non-Clinical Information Shapes Clinical Decisions in LLMs. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT '25). Association for Computing Machinery, New York, NY, USA, 1805–1828. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.identifier.mitlicense: PUBLISHER_POLICY
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2025-08-01T08:34:53Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2025-08-01T08:34:53Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

