dc.contributor.author Ananthakrishnan, Ashwin N.
dc.contributor.author Cai, Tianxi
dc.contributor.author Savova, Guergana
dc.contributor.author Cheng, Su-Chun
dc.contributor.author Chen, Pei
dc.contributor.author Perez, Raul Guzman
dc.contributor.author Gainer, Vivian
dc.contributor.author Murphy, Shawn N.
dc.contributor.author Szolovits, Peter
dc.contributor.author Xia, Zongqi
dc.contributor.author Shaw, Stanley
dc.contributor.author Churchill, Susanne
dc.contributor.author Karlson, Elizabeth W.
dc.contributor.author Kohane, Isaac
dc.contributor.author Plenge, Robert M.
dc.contributor.author Liao, Katherine P.
dc.date.accessioned 2014-10-10T18:18:27Z
dc.date.available 2014-10-10T18:18:27Z
dc.date.issued 2013-06
dc.identifier.issn 1078-0998
dc.identifier.issn 1536-4844
dc.identifier.uri http://hdl.handle.net/1721.1/90903
dc.description available in PMC 2014 June 01 en_US
dc.description.abstract Background: Previous studies identifying patients with inflammatory bowel disease using administrative codes have yielded inconsistent results. Our objective was to develop a robust electronic medical record–based model for classification of inflammatory bowel disease leveraging the combination of codified data and information from clinical text notes using natural language processing. Methods: Using the electronic medical records of 2 large academic centers, we created data marts for Crohn’s disease (CD) and ulcerative colitis (UC) comprising patients with ≥1 International Classification of Diseases, 9th edition, code for each disease. We used codified (i.e., International Classification of Diseases, 9th edition codes, electronic prescriptions) and narrative data from clinical notes to develop our classification model. Model development and validation was performed in a training set of 600 randomly selected patients for each disease with medical record review as the gold standard. Logistic regression with the adaptive LASSO penalty was used to select informative variables. Results: We confirmed 399 CD cases (67%) in the CD training set and 378 UC cases (63%) in the UC training set. For both, a combined model including narrative and codified data had better accuracy (area under the curve for CD 0.95; UC 0.94) than models using only disease International Classification of Diseases, 9th edition codes (area under the curve 0.89 for CD; 0.86 for UC). Addition of natural language processing narrative terms to our final model resulted in classification of 6% to 12% more subjects with the same accuracy. Conclusions: Inclusion of narrative concepts identified using natural language processing improves the accuracy of electronic medical records case definition for CD and UC while simultaneously identifying more subjects compared with models using codified data alone. en_US
dc.description.sponsorship National Institutes of Health (U.S.) (NIH U54-LM008748) en_US
dc.description.sponsorship American Gastroenterological Association en_US
dc.description.sponsorship National Institutes of Health (U.S.) (NIH K08 AR060257) en_US
dc.description.sponsorship Beth Israel Deaconess Medical Center (Katherine Swan Ginsburg Fund) en_US
dc.description.sponsorship National Institutes of Health (U.S.) (NIH R01-AR056768) en_US
dc.description.sponsorship Burroughs Wellcome Fund (Career Award for Medical Scientists) en_US
dc.description.sponsorship National Institutes of Health (U.S.) (NIH U01-GM092691) en_US
dc.description.sponsorship National Institutes of Health (U.S.) (NIH R01-AR059648) en_US
dc.language.iso en_US
dc.publisher Lippincott Williams & Wilkins en_US
dc.relation.isversionof http://dx.doi.org/10.1097/MIB.0b013e31828133fd en_US
dc.rights Creative Commons Attribution-Noncommercial-Share Alike en_US
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0/ en_US
dc.source PMC en_US
dc.title Improving Case Definition of Crohnʼs Disease and Ulcerative Colitis in Electronic Medical Records Using Natural Language Processing en_US
dc.type Article en_US
dc.identifier.citation Ananthakrishnan, Ashwin N., Tianxi Cai, Guergana Savova, Su-Chun Cheng, Pei Chen, Raul Guzman Perez, Vivian S. Gainer, et al. “Improving Case Definition of Crohnʼs Disease and Ulcerative Colitis in Electronic Medical Records Using Natural Language Processing.” Inflammatory Bowel Diseases 19, no. 7 (2013): 1411–1420. en_US
dc.contributor.department Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science en_US
dc.contributor.mitauthor Szolovits, Peter en_US
dc.relation.journal Inflammatory Bowel Diseases en_US
dc.eprint.version Author's final manuscript en_US
dc.type.uri http://purl.org/eprint/type/JournalArticle en_US
eprint.status http://purl.org/eprint/status/PeerReviewed en_US
dspace.orderedauthors Ananthakrishnan, Ashwin N.; Cai, Tianxi; Savova, Guergana; Cheng, Su-Chun; Chen, Pei; Perez, Raul Guzman; Gainer, Vivian S.; Murphy, Shawn N.; Szolovits, Peter; Xia, Zongqi; Shaw, Stanley; Churchill, Susanne; Karlson, Elizabeth W.; Kohane, Isaac; Plenge, Robert M.; Liao, Katherine P. en_US
dc.identifier.orcid https://orcid.org/0000-0001-8411-6403
mit.license OPEN_ACCESS_POLICY en_US
mit.metadata.status Complete

