Show simple item record

dc.contributor.author: Gehrmann, Sebastian
dc.contributor.author: Li, Yeran
dc.contributor.author: Carlson, Eric T.
dc.contributor.author: Wu, Joy T.
dc.contributor.author: Welt, Jonathan
dc.contributor.author: Foote, John
dc.contributor.author: Moseley, Edward T.
dc.contributor.author: Grant, David W.
dc.contributor.author: Tyler, Patrick D.
dc.contributor.author: Dernoncourt, Franck
dc.contributor.author: Celi, Leo Anthony G.
dc.date.accessioned: 2018-04-24T17:50:16Z
dc.date.available: 2018-04-24T17:50:16Z
dc.date.issued: 2018-02
dc.date.submitted: 2017-06
dc.identifier.issn: 1932-6203
dc.identifier.uri: http://hdl.handle.net/1721.1/114939
dc.description.abstract: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. In secondary analysis of electronic health records, a crucial task consists in correctly identifying the patient cohort under investigation. In many cases, the most valuable and relevant information for an accurate classification of medical conditions exists only in clinical narratives. Therefore, it is necessary to use natural language processing (NLP) techniques to extract and evaluate these narratives. The most commonly used approach to this problem relies on extracting a number of clinician-defined medical concepts from text and using machine learning techniques to identify whether a particular patient has a certain condition. However, recent advances in deep learning and NLP enable models to learn a rich representation of (medical) language. Convolutional neural networks (CNN) for text classification can augment the existing techniques by leveraging the representation of language to learn which phrases in a text are relevant for a given medical condition. In this work, we compare concept extraction based methods with CNNs and other commonly used models in NLP in ten phenotyping tasks using 1,610 discharge summaries from the MIMIC-III database. We show that CNNs outperform concept extraction based methods in almost all of the tasks, with an improvement in F1-score of up to 26 and up to 7 percentage points in area under the ROC curve (AUC). We additionally assess the interpretability of both approaches by presenting and evaluating methods that calculate and extract the most salient phrases for a prediction. The results indicate that CNNs are a valid alternative to existing approaches in patient phenotyping and cohort identification, and should be further investigated. Moreover, the deep learning approach presented in this paper can be used to assist clinicians during chart review or support the extraction of billing codes from text by identifying and highlighting relevant phrases for various medical conditions. [en_US]
dc.description.sponsorship: National Institute of Biomedical Imaging and Bioengineering (U.S.) (Grant R01 EB017205-01A1) [en_US]
dc.publisher: Public Library of Science [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1371/journal.pone.0192360 [en_US]
dc.rights: Creative Commons Attribution 4.0 International License [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: PLoS [en_US]
dc.title: Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Gehrmann, Sebastian et al. “Comparing Deep Learning and Concept Extraction Based Methods for Patient Phenotyping from Clinical Narratives.” Edited by Jen-Hsiang Chuang. PLOS ONE 13, 2 (February 2018): e0192360 © 2018 Gehrmann et al [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Institute for Medical Engineering & Science [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.department: MIT Critical Data (Laboratory)
dc.contributor.mitauthor: Dernoncourt, Franck
dc.contributor.mitauthor: Celi, Leo Anthony G.
dc.relation.journal: PLOS ONE [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2018-04-20T17:52:56Z
dspace.orderedauthors: Gehrmann, Sebastian; Dernoncourt, Franck; Li, Yeran; Carlson, Eric T.; Wu, Joy T.; Welt, Jonathan; Foote, John; Moseley, Edward T.; Grant, David W.; Tyler, Patrick D.; Celi, Leo A. [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-1119-1346
mit.license: PUBLISHER_CC [en_US]

