Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues
Author(s)
Ernst, Jason; Kellis, Manolis
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Terms of use
Abstract
With hundreds of epigenomic maps, the opportunity arises to exploit the correlated nature of epigenetic signals, across both marks and samples, for large-scale prediction of additional datasets. Here, we undertake epigenome imputation by leveraging such correlations through an ensemble of regression trees. We impute 4,315 high-resolution signal maps, of which 26% are also experimentally observed. Imputed signal tracks show overall similarity to observed signals and surpass experimental datasets in consistency, recovery of gene annotations and enrichment for disease-associated variants. We use the imputed data to detect low-quality experimental datasets, to find genomic sites with unexpected epigenomic signals, to define high-priority marks for new experiments and to delineate chromatin states in 127 reference epigenomes spanning diverse tissues and cell types. Our imputed datasets provide the most comprehensive human regulatory region annotation to date, and our approach and the ChromImpute software constitute a useful complement to large-scale experimental mapping of epigenomic information.
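The idea of predicting one signal track from correlated tracks with an ensemble of regression trees can be illustrated with a small sketch. This is not the ChromImpute implementation: the toy data, the two features (standing in for the same mark in another sample and another mark in the same sample), the depth-1 trees, the bootstrap averaging, and every function name are assumptions made for illustration only.

```python
import random

def fit_stump(X, y):
    """Fit a depth-1 regression tree by exhaustive search over (feature, threshold)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:  # no valid split: predict the mean on both sides
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]

def fit_ensemble(X, y, n_trees=25, seed=0):
    """Train each tree on a bootstrap resample of the training positions."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def predict(trees, row):
    """Average the predictions of all trees for one genomic position."""
    preds = [ml if row[j] <= t else mr for j, t, ml, mr in trees]
    return sum(preds) / len(preds)

# Hypothetical toy data: each row is one genomic position with two observed,
# correlated signals; y is the target signal to be imputed.
rng = random.Random(1)
X, y = [], []
for _ in range(120):
    base = rng.uniform(0.0, 10.0)
    X.append([base + rng.gauss(0, 0.5), base + rng.gauss(0, 0.5)])
    y.append(base + rng.gauss(0, 0.5))

ensemble = fit_ensemble(X, y)
print(predict(ensemble, [9.0, 9.0]), predict(ensemble, [1.0, 1.0]))
```

Because the target is correlated with both observed signals, positions with high observed signal receive a higher imputed value than positions with low observed signal. A realistic setup would use deeper trees and many more features.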
Date issued
2015-02
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Nature Biotechnology
Publisher
Nature Publishing Group
Citation
Ernst, Jason, and Manolis Kellis. “Large-Scale Imputation of Epigenomic Datasets for Systematic Annotation of Diverse Human Tissues.” Nature Biotechnology 33, no. 4 (February 18, 2015): 364–376.
Version: Author's final manuscript
ISSN
1087-0156
1546-1696