dc.contributor.advisor: Caroline Uhler. [en_US]
dc.contributor.author: Sun, Lawrence (Lawrence J.) [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2018-12-11T20:40:33Z
dc.date.available: 2018-12-11T20:40:33Z
dc.date.copyright: 2018 [en_US]
dc.date.issued: 2018 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/119570
dc.description: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018. [en_US]
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. [en_US]
dc.description: Cataloged from student-submitted PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 61-62). [en_US]
dc.description.abstract: The spatial organization of DNA in the cell nucleus plays an important role for gene regulation, DNA replication, and genomic integrity. Through the development of chromosome capture experiments (such as 3C, 4C, Hi-C) it is now possible to obtain the contact frequencies of the DNA at the whole-genome level. In this thesis, we study the problem of reconstructing the 3D organization of the genome from whole-genome contact frequencies. A standard approach is to transform the contact frequencies into noisy distance measurements and then apply semidefinite programming (SDP) formulations to obtain the 3D configurations. However, neglected in such reconstructions is the fact that most eukaryotes including humans are diploid and therefore contain two (from the available data) indistinguishable copies of each genomic locus. Due to this, the standard approach performs very poorly on diploid organisms. We prove that the 3D organization of the DNA is not identifiable from exclusively chromosome capture data for diploid organisms. In fact, there are infinitely many solutions even in the noise-free setting. We then discuss various additional biologically relevant constraints (including distances between neighboring genomic loci and to the nucleus center or higher-order interactions). Under these conditions we prove there are finitely many solutions and conjecture we in fact have identifiability. Finally, we provide SDP formulations for computing the 3D embedding of the DNA with these additional constraints and show that we can recover the true 3D embedding with high accuracy even under noise. [en_US]
dc.description.statementofresponsibility: by Lawrence Sun. [en_US]
dc.format.extent: 62 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Inference of 3D structure of diploid chromosomes [en_US]
dc.title.alternative: Inference of three-dimensional structure of diploid chromosomes [en_US]
dc.title.alternative: Inference of three-D structure of diploid chromosomes [en_US]
dc.type: Thesis [en_US]
dc.description.degree: M. Eng. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 1076344608 [en_US]