Comparison of dispersion metrics for estimating transcriptional noise in single-cell RNA-seq data and applications to cardiomyocyte biology
Author(s)
Chen, Tina T.
Advisor
Boyer, Laurie A.
Abstract
Transcription is a dynamic process with a multitude of characteristics, including transcript level, burst frequency, amplitude, and variability. Single-cell RNA sequencing data analysis often focuses on comparing transcription levels. However, these analyses capture only a portion of the wealth of information conveyed by transcription. The quantification and analysis of transcriptional variability offers an opportunity to study transcription and gene regulation from a new angle. Transcriptional variability has already been implicated in a number of biological processes, including immune system development and aging. Yet the most appropriate method for measuring transcriptional variability in single-cell data has remained unclear. Here, we simulated single-cell data with varying dispersion and dataset size to assess the relative responsiveness of the Gini index, variance-to-mean ratio, variance, and Shannon entropy to variability in single-cell counts. We found that the variance-to-mean ratio scales approximately linearly with increasing dispersion, and that it is scale-invariant. The Gini index displayed paradoxical behavior, and Shannon entropy was not scale-invariant. Thus, we applied the variance-to-mean ratio to measure transcriptional variability in two publicly available datasets studying congenital heart defects in mouse models. We first found that change in transcriptional variability does not correlate with gene characteristics such as transcript level and evolutionary gene age. We also found that using change in transcriptional variability to focus GSEA and TF motif enrichment analyses revealed both genes with known involvement in cardiomyopathy and new genes and pathways as potential targets for future study.
Notably, many of the genes and pathways identified through transcriptional variability analysis were not found by differential expression analysis, suggesting that transcriptional variability can provide additional biologically relevant information beyond what is observed from studying mean expression alone.
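The four metrics compared in the abstract can be illustrated with a minimal sketch. This is not the thesis's simulation code: the negative binomial count model, its parameterization (variance = mean + dispersion × mean²), the entropy definition (entropy of the empirical distribution of count values), and all function names are assumptions made here for illustration only.

```python
import numpy as np


def gini_index(x):
    """Gini index of non-negative values (0 = perfectly even)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * total) - (n + 1) / n)


def variance_to_mean_ratio(x):
    """Sample variance divided by the mean (also called the Fano factor)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    return float(x.var(ddof=1) / mean) if mean > 0 else float("nan")


def shannon_entropy(x):
    """Entropy in bits of the empirical distribution of count values.

    Assumption: the thesis may define entropy differently.
    """
    _, freq = np.unique(np.asarray(x), return_counts=True)
    p = freq / freq.sum()
    return float(-np.sum(p * np.log2(p)))


def simulate_counts(mean, dispersion, n_cells, rng):
    """Draw negative binomial counts with variance = mean + dispersion * mean**2.

    Assumption: a hypothetical stand-in for the simulation described above.
    """
    r = 1.0 / dispersion          # NB "size" parameter
    p = r / (r + mean)            # NB success probability
    return rng.negative_binomial(r, p, size=n_cells)


rng = np.random.default_rng(0)
for dispersion in (0.1, 0.5, 1.0, 2.0):
    counts = simulate_counts(mean=10.0, dispersion=dispersion, n_cells=5000, rng=rng)
    print(
        f"dispersion={dispersion:<4} "
        f"gini={gini_index(counts):.3f} "
        f"vmr={variance_to_mean_ratio(counts):.3f} "
        f"var={counts.var(ddof=1):.3f} "
        f"entropy={shannon_entropy(counts):.3f}"
    )
```

Under this assumed model the expected variance-to-mean ratio is 1 + dispersion × mean, so at a fixed mean it grows linearly with the dispersion parameter. How each metric responds to dataset size can be explored by varying `n_cells`.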
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology