DSpace@MIT (MIT Libraries)


Stitching and sketching large-scale single-cell transcriptomic data

Author(s)
Hie, Brian.
Download
1102049692-MIT.pdf (18.79Mb)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Bonnie Berger.
Terms of use
MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
Researchers are generating single-cell RNA sequencing (scRNA-seq) profiles of diverse biological systems [1]-[7] and of every cell type in the human body [8] at an unprecedented scale, with scRNA-seq experiments regularly profiling gene expression in hundreds of thousands or even millions of cells [9]. Leveraging these data to gain new insight into biology and disease requires algorithms that scale to the volume of data being generated and that integrate information across multiple experiments, laboratories, and technologies. Here, we present two algorithms that aim to help researchers gain better insight from scRNA-seq data sets. The first, Scanorama, is inspired by algorithms for panorama stitching and achieves accurate integration of heterogeneous scRNA-seq data sets; we use it to integrate a number of large and complex collections of data sets. The second, geometric sketching, is a sampling approach that aims to evenly cover the low-dimensional manifold spanned by the cells. Compared with uniform subsampling, which selects each cell with equal probability, it captures more of the rare transcriptional structure and yields sketches that better reflect the transcriptional heterogeneity of the original data. Moreover, geometric sketching can be used to improve the computational efficiency of algorithms for single-cell integration, including Scanorama. We anticipate that both algorithms will play an important role in the analysis and interpretation of large-scale single-cell transcriptomic data sets.
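The abstract does not spell out how geometric sketching covers the manifold, so the following is only a rough illustration of the general idea of coverage-based sampling, not the thesis implementation. It assumes that cells in a low-dimensional embedding are binned into equal-sized grid boxes and that samples are drawn round-robin across occupied boxes. The function name `grid_sketch`, the `box_width` parameter, and the synthetic two-cluster data are all hypothetical choices made for this example.

```python
import numpy as np


def grid_sketch(X, n_samples, box_width, seed=0):
    """Return row indices of X, spread evenly over occupied grid boxes.

    Illustrative stand-in only: the binning scheme and box_width are
    assumptions of this sketch, not taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    # Assign every point to an axis-aligned box with side length box_width.
    box_ids = np.floor(X / box_width).astype(np.int64)
    _, inverse = np.unique(box_ids, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n_boxes = int(inverse.max()) + 1
    # Shuffle the members of each box so picks within a box are random.
    members = [rng.permutation(np.flatnonzero(inverse == b)) for b in range(n_boxes)]

    chosen = []
    depth = 0
    while len(chosen) < n_samples:
        # Boxes that still hold an unused point at this depth.
        available = [b for b in range(n_boxes) if len(members[b]) > depth]
        if not available:
            break  # fewer points than requested samples
        # One point per box per pass, visiting boxes in random order.
        for b in rng.permutation(available):
            chosen.append(members[b][depth])
            if len(chosen) == n_samples:
                break
        depth += 1
    return np.array(chosen)


# Synthetic data (hypothetical): one abundant population and one rare one.
rng = np.random.default_rng(1)
common = rng.normal(loc=0.0, scale=1.0, size=(10_000, 2))
rare = rng.normal(loc=20.0, scale=1.0, size=(50, 2))
X = np.vstack([common, rare])
is_rare = np.arange(len(X)) >= len(common)

sketch_idx = grid_sketch(X, n_samples=200, box_width=1.0)
uniform_idx = rng.choice(len(X), size=200, replace=False)

rare_in_sketch = int(is_rare[sketch_idx].sum())
rare_in_uniform = int(is_rare[uniform_idx].sum())
print("rare cells in grid sketch:", rare_in_sketch)
print("rare cells in uniform sample:", rare_in_uniform)
```

Because every occupied box contributes a point before any box contributes a second one, the sparse population is represented in proportion to the space it occupies rather than to its cell count. Uniform subsampling, by contrast, is expected to pick rare cells only at their overall frequency (50 of 10,050 here).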
Description
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 57-65).
Date issued
2019
URI
https://hdl.handle.net/1721.1/121734
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.