
dc.contributor.advisor: Bonnie Berger. (en_US)
dc.contributor.author: Hie, Brian. (en_US)
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. (en_US)
dc.date.accessioned: 2019-07-17T20:59:03Z
dc.date.available: 2019-07-17T20:59:03Z
dc.date.copyright: 2019 (en_US)
dc.date.issued: 2019 (en_US)
dc.identifier.uri: https://hdl.handle.net/1721.1/121734
dc.description: Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019 (en_US)
dc.description: Cataloged from PDF version of thesis. (en_US)
dc.description: Includes bibliographical references (pages 57-65). (en_US)
dc.description.abstract: Researchers are generating single-cell RNA sequencing (scRNA-seq) profiles of diverse biological systems [1]-[7] and every cell type in the human body [8] at an unprecedented scale, with scRNA-seq experiments regularly profiling gene expression in hundreds of thousands or even millions of cells [9]. Leveraging this data to gain unprecedented insight into biology and disease requires algorithms that can scale to the tremendous amount of data being generated and can integrate information across multiple experiments, laboratories, and technologies. Here, we present two algorithms that aim to aid researchers in gaining better insight from scRNA-seq data sets. The first, Scanorama, inspired by algorithms for panorama stitching, achieves accurate integration of heterogeneous scRNA-seq data sets, which we use to integrate a number of large and complex collections of data sets. The second algorithm, geometric sketching, is a sampling approach that aims to evenly cover the low-dimensional manifold spanned by the cells to capture more of the rare transcriptional structure than would uniform subsampling with equal probability for each cell, obtaining sketches that better capture the transcriptional heterogeneity of the original data. Moreover, geometric sketching can be used to improve the computational efficiency of algorithms for single-cell integration, including Scanorama. We anticipate that both algorithms will play an important role in the analysis and interpretation of large-scale single-cell transcriptomic data sets. (en_US)
dc.description.statementofresponsibility: by Brian Hie. (en_US)
dc.format.extent: 111 pages (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Massachusetts Institute of Technology (en_US)
dc.rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. (en_US)
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 (en_US)
dc.subject: Electrical Engineering and Computer Science. (en_US)
dc.title: Stitching and sketching large-scale single-cell transcriptomic data (en_US)
dc.type: Thesis (en_US)
dc.description.degree: S.M. (en_US)
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (en_US)
dc.identifier.oclc: 1102049692 (en_US)
dc.description.collection: S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science (en_US)
dspace.imported: 2019-07-17T20:59:01Z (en_US)
mit.thesis.degree: Master (en_US)
mit.thesis.department: EECS (en_US)

