
Field | Value | Language
dc.contributor.advisor | Roger G. Mark. | en_US
dc.contributor.author | Douglass, Margaret, 1981- | en_US
dc.contributor.other | Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. | en_US
dc.date.accessioned | 2006-07-13T15:13:32Z |
dc.date.available | 2006-07-13T15:13:32Z |
dc.date.copyright | 2005 | en_US
dc.date.issued | 2005 | en_US
dc.identifier.uri | http://hdl.handle.net/1721.1/33299 |
dc.description | Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. | en_US
dc.description | Includes bibliographical references (leaves 67-70). | en_US
dc.description.abstract | Medical researchers are legally required to protect patients' privacy by removing personally identifiable information from medical records before sharing the data with other researchers. Different computer-assisted methods are evaluated for removing and replacing protected health information (PHI) from free-text nursing notes collected in the hospital intensive care unit. A semi-automated method was developed to allow clinicians to highlight PHI on the screen of a tablet PC and to compare and combine the selections of different experts reading the same notes. Expert adjudication demonstrated that inter-human variability was high, with few false positives and many false negatives. A preliminary automated de-identification algorithm generated few false negatives but many false positives. A second automated algorithm was developed using the successful portions of the first algorithm and incorporating other heuristic methods to improve overall performance. A large de-identified collection of nursing notes was re-identified with realistic surrogate (but unprotected) dates, serial numbers, names, and phrases to form a "gold standard" reference database of over 2600 notes (approximately 340,000 words) with over 1800 labeled instances of PHI. This gold standard database of nursing notes and the Java source code used to evaluate algorithm performance will be made freely available on the Physionet web site in order to facilitate the development and validation of future de-identification algorithms. | en_US
dc.description.statementofresponsibility | by Margaret Douglass. | en_US
dc.format.extent | 70 leaves | en_US
dc.format.extent | 3923649 bytes |
dc.format.extent | 3926254 bytes |
dc.format.mimetype | application/pdf |
dc.format.mimetype | application/pdf |
dc.language.iso | eng | en_US
dc.publisher | Massachusetts Institute of Technology | en_US
dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 |
dc.subject | Electrical Engineering and Computer Science. | en_US
dc.title | Computer-assisted de-identification of free-text nursing notes | en_US
dc.type | Thesis | en_US
dc.description.degree | M.Eng. | en_US
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
dc.identifier.oclc | 62279367 | en_US

