Non-Parametric Analyses of the Regulatory Roles of LINE-1 Retrotransposons during Motor Neuron Differentiation
Author(s)
Park, Hyunjin
Advisor
Gifford, David K.
Abstract
Background: Repetitive elements make up a large portion of eukaryotic genomes, constituting two-thirds of the human genome. Although their functional importance was recognized as early as the 1950s, the study of their functions has stagnated despite advances in next-generation sequencing. The main obstacle is the difficulty of establishing, from functional data modalities such as ChIP-seq and ChIA-PET, the identities of the specific elements involved in a biological process of interest, e.g. transcription factor (TF) binding.
Results: First, I present a non-parametric, k-mer based method that overcomes the analysis ambiguities introduced by short-read multimapping and by the incompleteness of reference genomes in low-complexity regions. I use this method to derive inferential evidence from ChIP-seq datasets for the cell type-specific binding of transcription factors to specific L1 subfamilies. Second, I apply a method named Mates of Chimera (MoC) to identify L1-derived extrachromosomal circular DNAs (eccDNAs) from Circulome-seq datasets. I characterize differential eccDNA compositions in ESC and MN cell types and find differential enrichment of transcription factor binding motifs in cell type-specific eccDNAs. Third, I present inferential evidence consistent with the hypothesis that some low-complexity regions may participate in chromatin interactions with cis-regulatory sequences in a cell type-specific manner analogous to enhancer-promoter interactions.
Conclusion: The thesis elucidates a set of functional hypotheses concerning putative regulatory roles of repetitive elements, L1 elements in particular, that may be extrachromosomal. I base my hypotheses on a wide range of available data modalities including whole-genome sequencing, ChIP-seq, Circulome-seq, and ChIA-PET through non-parametric, k-mer based methods that do not rely on exact read alignment coordinates.
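To illustrate what a k-mer based analysis "that does not rely on exact read alignment coordinates" can look like, the following is a minimal sketch and not the method developed in the thesis. It assumes that each repeat subfamily is represented by a consensus sequence, and it assigns a read to a subfamily only when every subfamily-specific k-mer the read contains points to the same subfamily. The function names, the toy sequences, and the choice of k = 5 are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Translation table for complementing DNA bases (other characters pass through).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(seq, k):
    """Yield every length-k substring of seq, uppercased."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def subfamily_specific_kmers(consensus_by_name, k):
    """Map each k-mer seen in exactly one consensus (either strand) to its name."""
    owners = {}
    for name, seq in consensus_by_name.items():
        for strand in (seq, revcomp(seq)):
            for km in set(kmers(strand, k)):
                owners.setdefault(km, set()).add(name)
    return {km: next(iter(names)) for km, names in owners.items() if len(names) == 1}


def count_reads_by_subfamily(reads, specific, k):
    """Count reads whose specific k-mers all point to one subfamily.

    Reads with no specific k-mer, or with conflicting ones, are not counted.
    No alignment coordinates are used at any point.
    """
    counts = Counter()
    for read in reads:
        hits = {specific[km] for km in kmers(read, k) if km in specific}
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return counts


# Toy data (hypothetical): two short "consensus" sequences and three reads.
K = 5
consensus = {"subfamily_A": "ACGTACGGTT", "subfamily_B": "ACGTACCCTT"}
reads = ["TACGGTT", "GGGTACG", "ACGTAC"]

specific = subfamily_specific_kmers(consensus, K)
counts = count_reads_by_subfamily(reads, specific, K)
print(dict(counts))
```

In this toy example the third read contains only k-mers shared by both consensus sequences, so it is left unassigned rather than being placed arbitrarily, which is the kind of multimapping ambiguity an alignment-free count avoids. Comparing such counts between a ChIP sample and its control would still require a separate statistical test, which this sketch does not attempt.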
Date issued
2022-02
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology