Matching data fragments with imperfect identifiers from disparate sources
Author(s)
Craig, Michael B.; Moody, Benjamin E.; Jia, Xiaoming; Villarroel, Mauricio C.; Mark, Roger G.
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
The Multiparameter Intelligent Monitoring in Intensive Care (MIMIC-II) Database includes waveforms and derived parameters from bedside monitors, clinical data from an ICU information system, and data from other hospital laboratories and archives, for thousands of patients. These data come from devices under separate domains that often do not retain detailed information regarding relationships between parameters. We developed software for matching data fragments with incomplete and sometimes incorrect identifiers. We found that names, medical record numbers, waveform times and durations, and ICU admission and discharge records were most helpful when available; however, physiological data can also be used in some circumstances. Rule-based normalization and text edit-distance metrics are used in addition to a visual verification tool for patients whose records cannot be assembled automatically. Thus, a majority of the available waveform recordings are matched to patients in the clinical database.
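As a rough illustration of how rule-based normalization and an edit-distance metric can work together, the following minimal Python sketch normalizes names and medical record numbers and then picks the closest candidate by Levenshtein distance. It is not the authors' software: the function names (normalize_name, normalize_mrn, edit_distance, best_match), the specific normalization rules, and the max_distance threshold of 2 are all assumptions made for this example, since the abstract does not specify them.

```python
import re
import unicodedata


def normalize_name(raw):
    """Hypothetical rules: strip accents, uppercase, reorder 'Last, First', drop punctuation."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.upper()
    if "," in text:
        last, first = text.split(",", 1)
        text = first + " " + last
    text = re.sub(r"[^A-Z ]", "", text)
    return " ".join(text.split())  # collapse repeated whitespace


def normalize_mrn(raw):
    """Hypothetical rule: keep digits only and ignore leading zeros."""
    return re.sub(r"\D", "", raw).lstrip("0")


def edit_distance(a, b):
    """Levenshtein distance (insert, delete, substitute), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def best_match(fragment_name, candidate_names, max_distance=2):
    """Return the single closest candidate, or None if too distant or tied.

    max_distance is an assumed threshold chosen for illustration only.
    """
    target = normalize_name(fragment_name)
    scored = sorted(
        (edit_distance(target, normalize_name(c)), c) for c in candidate_names
    )
    if not scored or scored[0][0] > max_distance:
        return None
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # ambiguous: leave for manual (e.g. visual) verification
    return scored[0][1]


if __name__ == "__main__":
    print(best_match("DOE, JHON", ["John Doe", "Jane Roe"]))
    print(normalize_mrn("MRN 00-123456") == normalize_mrn("123456"))
```

Returning None for distant or tied candidates mirrors the idea in the abstract that records which cannot be assembled automatically are left for manual verification.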
Date issued
2010-09
Department
Harvard University--MIT Division of Health Sciences and Technology; Harvard--MIT Program in Health Sciences and Technology. Laboratory for Computational Physiology
Journal
Computing in Cardiology, 2010
Publisher
Institute of Electrical and Electronics Engineers
Citation
Craig, M.B. et al. “Matching data fragments with imperfect identifiers from disparate sources.” Computing in Cardiology, 2010. (2010) 37:793-796. © Copyright 2012 IEEE.
Version: Final published version
Other identifiers
INSPEC Accession Number: 11883802
ISBN
978-1-4244-7318-2
ISSN
0276-6547