
dc.contributor.advisor: Una-May O'Reilly and Kalyan Veeramachaneni. [en_US]
dc.contributor.author: Ezeozue, Chidube Donald [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2014-04-25T15:48:12Z
dc.date.available: 2014-04-25T15:48:12Z
dc.date.copyright: 2013 [en_US]
dc.date.issued: 2013 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/86273
dc.description: Thesis: S.M. in Technology and Policy, Massachusetts Institute of Technology, Engineering Systems Division, Technology and Policy Program, 2013. [en_US]
dc.description: Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 97-101). [en_US]
dc.description.abstract: An intersection of events has led to a massive increase in the amount of medical data being collected from patients inside and outside the hospital. These events include the development of new sensors, the continuous decrease in the cost of data storage, the development of Big Data algorithms in other domains and the Health Information Technology for Economic and Clinical Health (HITECH) Act's $20 billion incentive for hospitals to install and use Electronic Health Record (EHR) systems. The data being collected presents an excellent opportunity to improve patient care. However, this opportunity is not without its challenges. Some of the challenges are technical in nature, not the least of which is how to efficiently process such massive amounts of data. At the other end of the spectrum, there are policy questions that deal with data privacy, confidentiality and ownership to ensure that research continues unhindered while preserving the rights and interests of the stakeholders involved. This thesis addresses both ends of the challenge spectrum. First of all, we design and implement a number of methods for automatically discovering groups within large amounts of data, otherwise known as clustering. We believe this technique would prove particularly useful in identifying patient states, segregating cohorts of patients and hypothesis generation. Specifically, we scale a popular clustering algorithm, Expectation-Maximization (EM) for Gaussian Mixture Models to be able to run on a cloud of computers. We also give a lot of attention to the idea of Consensus Clustering which allows multiple clusterings to be merged into a single ensemble clustering. Here, we scale one existing consensus clustering algorithm, which relies on EM for multinomial mixture models. We also develop and implement a more general framework for retrofitting any consensus clustering algorithm and making it amenable to streaming data as well as distribution on a cloud.
On the policy end of the spectrum, we argue that the issue of data ownership is essential and highlight how the law in the United States has handled this issue in the past several decades, focusing on common law and state law approaches. We proceed to identify the flaws, especially the fragmentation, in the current system and make recommendations for a more equitable and efficient policy stance. The recommendations center on codifying the policy stance in Federal Law and allocating the property rights of the data to both the healthcare provider and the patient. [en_US]
dc.description.statementofresponsibility: by Chidube Donald Ezeozue. [en_US]
dc.format.extent: 101 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Engineering Systems Division. [en_US]
dc.subject: Technology and Policy Program. [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Large-scale consensus clustering and data ownership considerations for medical applications [en_US]
dc.type: Thesis [en_US]
dc.description.degree: S.M. in Technology and Policy [en_US]
dc.description.degree: S.M. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.department: Massachusetts Institute of Technology. Engineering Systems Division
dc.contributor.department: Technology and Policy Program
dc.identifier.oclc: 874576898 [en_US]

