Stitching and sketching large-scale single-cell transcriptomic data
Author(s)
Hie, Brian.
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Bonnie Berger.
Abstract
Researchers are generating single-cell RNA sequencing (scRNA-seq) profiles of diverse biological systems [1]-[7] and of every cell type in the human body [8] at an unprecedented scale, with scRNA-seq experiments regularly profiling gene expression in hundreds of thousands or even millions of cells [9]. Leveraging these data to gain new insight into biology and disease requires algorithms that scale to the volume of data being generated and that integrate information across multiple experiments, laboratories, and technologies. Here, we present two algorithms that aim to help researchers gain better insight from scRNA-seq data sets. The first, Scanorama, is inspired by algorithms for panorama stitching and achieves accurate integration of heterogeneous scRNA-seq data sets; we use it to integrate a number of large and complex collections of data sets. The second, geometric sketching, is a sampling approach that aims to evenly cover the low-dimensional manifold spanned by the cells. Compared with uniform subsampling, which selects each cell with equal probability, it captures more of the rare transcriptional structure and yields sketches that better reflect the transcriptional heterogeneity of the original data. Moreover, geometric sketching can be used to improve the computational efficiency of algorithms for single-cell integration, including Scanorama. We anticipate that both algorithms will play an important role in the analysis and interpretation of large-scale single-cell transcriptomic data sets.
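As a rough illustration of the contrast the abstract draws between even coverage and uniform subsampling, the Python sketch below samples points by visiting occupied grid cells rather than individual points. It is not the thesis implementation. The grid-based rule, the function name `grid_sketch`, the `cell_size` parameter, and the synthetic two-cluster data are all assumptions made for this example.

```python
import numpy as np


def grid_sketch(X, n_samples, cell_size, seed=0):
    """Pick n_samples row indices of X by sampling occupied grid cells evenly.

    Hypothetical helper for illustration only; not the thesis implementation.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    if n_samples > len(X):
        raise ValueError("n_samples exceeds the number of points")
    # Map each point to the integer coordinates of its grid cell.
    cell_ids = np.floor((X - X.min(axis=0)) / cell_size).astype(int)
    # Label each point with the index of the occupied cell it falls in.
    _, inverse = np.unique(cell_ids, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n_cells = int(inverse.max()) + 1
    # Shuffle the points inside each cell so pops are random draws.
    buckets = [
        list(rng.permutation(np.flatnonzero(inverse == c)))
        for c in range(n_cells)
    ]
    chosen = []
    # Visit occupied cells in random order, taking one point per visit,
    # so densely populated cells receive no extra weight.
    while len(chosen) < n_samples:
        for c in rng.permutation(n_cells):
            if buckets[c] and len(chosen) < n_samples:
                chosen.append(int(buckets[c].pop()))
    return np.array(chosen)


# Synthetic data (assumed): one abundant population and one rare population.
rng = np.random.default_rng(0)
common = rng.normal(loc=0.0, scale=0.3, size=(5000, 2))
rare = rng.normal(loc=5.0, scale=0.3, size=(50, 2))
X = np.vstack([common, rare])
is_rare = np.arange(len(X)) >= 5000

sketch_idx = grid_sketch(X, n_samples=100, cell_size=0.25)
uniform_idx = rng.choice(len(X), size=100, replace=False)

print("rare points in grid sketch:   ", int(is_rare[sketch_idx].sum()))
print("rare points in uniform sample:", int(is_rare[uniform_idx].sum()))
```

Because each occupied cell is equally likely to contribute a point regardless of how many points it holds, the rare population should be represented far more heavily in the grid sketch than in the uniform sample, where it appears roughly in proportion to its share of the data (about 1% here).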
Description
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from PDF version of thesis. Includes bibliographical references (pages 57-65).
Date issued
2019
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.