BROCKMAN: deciphering variance in epigenomic regulators by k-mer factorization

Name

12859_2018_Article_2255.pdf

Size

2.03 MB

Format

Adobe PDF

Checksum (MD5)

a356b63d272f7bc105d9f81c8de88df6

Download all files submitted through automated deposit
art_3917073053761022030.zip (1.7 MB)
Author(s)
de Boer, Carl G.
Regev, Aviv
Date Issued
July 2018
Journal
BMC Bioinformatics
Publisher
BioMed Central
Citation
de Boer, Carl G., and Aviv Regev. “BROCKMAN: Deciphering Variance in Epigenomic Regulators by k-Mer Factorization.” BMC Bioinformatics, vol. 19, no. 1, Dec. 2018. © 2018 The Authors
Version
Final published version
Abstract
Background: Variation in chromatin organization across single cells can help shed important light on the mechanisms controlling gene expression, but scale, noise, and sparsity pose significant challenges for interpretation of single cell chromatin data. Here, we develop BROCKMAN (Brockman Representation Of Chromatin by K-mers in Mark-Associated Nucleotides), an approach to infer variation in transcription factor (TF) activity across samples through unsupervised analysis of the variation in DNA sequences associated with an epigenomic mark.

Results: BROCKMAN represents each sample as a vector of epigenomic-mark-associated DNA word frequencies, and decomposes the resulting matrix to find hidden structure in the data, followed by unsupervised grouping of samples and identification of the TFs that distinguish groups. Applied to single cell ATAC-seq, BROCKMAN readily distinguished cell types, treatments, batch effects, experimental artifacts, and cycling cells. We show that each variable component in the k-mer landscape reflects a set of co-varying TFs, which are often known to physically interact. For example, in K562 cells, AP-1 TFs were a central determinant of variability in chromatin accessibility through their variable expression levels and diverse interactions with other TFs. We provide a theoretical basis for why cooperative TF binding, and any associated epigenomic mark, is inherently more variable than non-cooperative binding.

Conclusions: BROCKMAN and related approaches will help gain a mechanistic understanding of the trans determinants of chromatin variability between cells, treatments, and individuals.

Keywords: Single-cell, Epigenome, Chromatin, scATAC-seq, K-mer, N-gram, Factorization, Decomposition, Clustering, Transcription factor
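The representation described in the Results can be illustrated with a small sketch. It is not the BROCKMAN implementation: it assumes plain ungapped k-mers, four made-up toy samples, and a principal-component decomposition computed by power iteration. The names `kmer_frequencies` and `first_component` are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter
from itertools import product
import math
import random


def kmer_frequencies(seqs, k):
    """Return one sample's frequency vector over all 4**k DNA words."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if set(word) <= set("ACGT"):  # skip words containing N etc.
                counts[word] += 1
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in words]


def first_component(matrix, iters=200, seed=0):
    """Top principal component of a samples-by-words matrix (power iteration)."""
    n, d = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in matrix]

    def project(v):
        return [sum(r[j] * v[j] for j in range(d)) for r in centered]

    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        scores = project(v)  # X v
        v = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]  # X^T X v
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm for x in v]
    return v, project(v)


# Toy data (assumed): sequences under a mark in four hypothetical samples.
samples = {
    "s1": ["ATATATATAT", "TATATATTAA"],
    "s2": ["AATTAATTAT", "ATATTATAAT"],
    "s3": ["GCGCGCGGCC", "CCGGCGCGCG"],
    "s4": ["GGCCGCGCGC", "CGCGGCCGCG"],
}
names = sorted(samples)
matrix = [kmer_frequencies(samples[name], k=2) for name in names]
loadings, scores = first_component(matrix)

# Crude unsupervised grouping: split samples by the sign of their score.
groups = {name: score > 0 for name, score in zip(names, scores)}
print(groups)
```

In this toy case the sign of the first component separates the AT-rich samples from the GC-rich ones, and the words with the largest absolute loadings indicate which sequence content drives the split. A real analysis would need many more samples, longer words, and a step that maps the weighted words back to TF motifs.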
MIT Department
Massachusetts Institute of Technology. Department of Biology
Terms of Use
Creative Commons Attribution
http://creativecommons.org/licenses/by/4.0/
Persistent DSpace Link
http://hdl.handle.net/1721.1/116880
DOI of Published Version
https://doi.org/10.1186/s12859-018-2255-6