Policy-Based Access Control in Federated Clinical Question Answering
Author(s)
Chen, Alice
DownloadThesis PDF (1.534Mb)
Advisor
Kagal, Lalana
Terms of use
Metadata
Show full item recordAbstract
Retrieval augmented generation (RAG) has recently expanded large language model versatility in answering domain-specific questions using dynamic external knowledge bases, particularly demonstrating promise in assisting clinical settings. However, due to its sensitive nature, patient medical data often requires retrieval to be federated across a decentralized network of hospital institutions, each maintaining internal databases and access control policies. Applying standard RAG to clinical question-answering tasks is complicated by the lack of an interface for hospital resource owners to regulate and restrict access to sensitive clinical documents during retrieval, which is essential for model feasibility in practice. We propose to leverage federated RAG retrieval for clinical trends inference across distributed medical records while adding authorization security mechanisms during retrieval to guarantee security of patient data. We propose (i) user identity authentication administered through a trusted federation of per-hospital OpenID Connect servers, (ii) a framework for integrating policy-based access control (PBAC) security mechanisms at flexible granularity into a federated RAG system to restrict medical data access based on user role attributes, and (iii) ClinicalTrendQA, a novel dataset to evaluate model performance for synthesizing clinical trends grounded on decentralized patient EHR information. To facilitate evaluation of our authorization PBAC framework on protecting information leakage during retrieval, we additionally present a federated 3-hospital case study and demonstrate that the same ClinicalTrendQA query under different user profiles holding varying degrees of access privileges observes the expected EHR information reduction. We also analyze metrics concerning the impact of this retrieval loss on end-to-end response quality against federated insecure and centralized RAG baselines.
Date issued
2024-05Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer SciencePublisher
Massachusetts Institute of Technology