dc.contributor.author: Krismer, Konstantin
dc.contributor.author: Hammelman, Jennifer
dc.contributor.author: Gifford, David K
dc.date.accessioned: 2022-06-28T16:33:56Z
dc.date.available: 2022-06-28T16:33:56Z
dc.date.issued: 2022-04-28
dc.identifier.uri: https://hdl.handle.net/1721.1/143575
dc.description.abstract: Motivation: Sequence models based on deep neural networks have achieved state-of-the-art performance on regulatory genomics prediction tasks, such as chromatin accessibility and transcription factor binding. But despite their high accuracy, their contribution to a mechanistic understanding of the biology of regulatory elements is often hindered by the complexity of the predictive model and thus poor interpretability of its decision boundaries. To address this, we introduce seqgra, a deep learning pipeline that incorporates the rule-based simulation of biological sequence data and the training and evaluation of models, whose decision boundaries mirror the rules from the simulation process. Results: We show that seqgra can be used to (i) generate data under the assumption of a hypothesized model of genome regulation, (ii) identify neural network architectures capable of recovering the rules of said model and (iii) analyze a model’s predictive performance as a function of training set size and the complexity of the rules behind the simulated data. Availability and implementation: The source code of the seqgra package is hosted on GitHub (https://github.com/gifford-lab/seqgra). seqgra is a pip-installable Python package. Extensive documentation can be found at https://kkrismer.github.io/seqgra. [en_US]
dc.language.iso: en
dc.publisher: Oxford University Press (OUP) [en_US]
dc.relation.isversionof: 10.1093/bioinformatics/btac101 [en_US]
dc.rights: Creative Commons Attribution 4.0 International license [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Oxford University Press [en_US]
dc.title: seqgra: principled selection of neural network architectures for genomics prediction tasks [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Krismer, Konstantin, Hammelman, Jennifer and Gifford, David K. 2022. "seqgra: principled selection of neural network architectures for genomics prediction tasks." Bioinformatics, 38 (9).
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department: Massachusetts Institute of Technology. Department of Biological Engineering
dc.contributor.department: Massachusetts Institute of Technology. Computational and Systems Biology Program
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.relation.journal: Bioinformatics [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2022-06-28T13:58:21Z
dspace.orderedauthors: Krismer, K; Hammelman, J; Gifford, DK [en_US]
dspace.date.submission: 2022-06-28T13:58:23Z
mit.journal.volume: 38 [en_US]
mit.journal.issue: 9 [en_US]
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

