dc.contributor.author: Chindelevitch, Leonid
dc.contributor.author: Loh, Po-Ru
dc.contributor.author: Enayetallah, Ahmed
dc.contributor.author: Berger, Bonnie
dc.contributor.author: Ziemek, Daniel
dc.date.accessioned: 2012-04-09T13:21:27Z
dc.date.available: 2012-04-09T13:21:27Z
dc.date.issued: 2012-02
dc.date.submitted: 2011-07
dc.identifier.issn: 1471-2105
dc.identifier.uri: http://hdl.handle.net/1721.1/69970
dc.description.abstract: Background: Causal graphs are an increasingly popular tool for the analysis of biological datasets. In particular, signed causal graphs--directed graphs whose edges additionally have a sign denoting upregulation or downregulation--can be used to model regulatory networks within a cell. Such models allow prediction of downstream effects of regulation of biological entities; conversely, they also enable inference of causative agents behind observed expression changes. However, due to their complex nature, signed causal graph models present special challenges with respect to assessing statistical significance. In this paper we frame and solve two fundamental computational problems that arise in practice when computing appropriate null distributions for hypothesis testing. Results: First, we show how to compute a p-value for agreement between observed and model-predicted classifications of gene transcripts as upregulated, downregulated, or neither. Specifically, how likely are the classifications to agree to the same extent under the null distribution of the observed classification being randomized? This problem, which we call "Ternary Dot Product Distribution" owing to its mathematical form, can be viewed as a generalization of Fisher's exact test to ternary variables. We present two computationally efficient algorithms for computing the Ternary Dot Product Distribution and investigate its combinatorial structure analytically and numerically to establish computational complexity bounds. Second, we develop an algorithm for efficiently performing random sampling of causal graphs. This enables p-value computation under a different, equally important null distribution obtained by randomizing the graph topology but keeping fixed its basic structure: connectedness and the positive and negative in- and out-degrees of each vertex. We provide an algorithm for sampling a graph from this distribution uniformly at random. We also highlight theoretical challenges unique to signed causal graphs; previous work on graph randomization has studied undirected graphs and directed but unsigned graphs. Conclusion: We present algorithmic solutions to two statistical significance questions necessary to apply the causal graph methodology, a powerful tool for biological network analysis. The algorithms we present are both fast and provably correct. Our work may be of independent interest in non-biological contexts as well, as it generalizes mathematical results that have been studied extensively in other fields. [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Graduate Research Fellowship) [en_US]
dc.description.sponsorship: National Institutes of Health (U.S.) (grant GM081871) [en_US]
dc.publisher: BioMed Central Ltd [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1186/1471-2105-13-35 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by/2.0 [en_US]
dc.source: BioMed Central Ltd [en_US]
dc.title: Assessing statistical significance in causal graphs [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Chindelevitch, Leonid et al. “Assessing statistical significance in causal graphs.” BMC Bioinformatics 13.1 (2012): 35. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics [en_US]
dc.contributor.mitauthor: Loh, Po-Ru
dc.contributor.mitauthor: Berger, Bonnie
dc.relation.journal: BMC Bioinformatics [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2012-03-19T12:10:36Z
dc.language.rfc3066: en
dc.rights.holder: Chindelevitch et al.; licensee BioMed Central Ltd.
dspace.orderedauthors: Chindelevitch, Leonid; Loh, Po-Ru; Enayetallah, Ahmed; Berger, Bonnie; Ziemek, Daniel [en]
dc.identifier.orcid: https://orcid.org/0000-0002-2724-7228
dspace.mitauthor.error: true
mit.license: PUBLISHER_CC [en_US]
mit.metadata.status: Complete
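
Illustrative note on the abstract: the "Ternary Dot Product Distribution" question can be made concrete with a small sketch. The code below is not one of the paper's two algorithms, which the abstract does not describe; it is a direct enumeration of 3x3 contingency tables with fixed margins, feasible only for small inputs. The function names, the +1/-1/0 encoding, and the one-sided p-value convention (probability of a score at least as large as the observed one) are assumptions made for this example.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def ternary_dot_product_distribution(predicted, observed):
    """Exact null distribution of the dot product of two ternary vectors.

    Entries are +1 (upregulated), -1 (downregulated) or 0 (neither).
    The null model permutes `observed` uniformly at random, so only the
    class counts of each vector matter. Returns {score: Fraction}.
    """
    if len(predicted) != len(observed):
        raise ValueError("vectors must have the same length")
    if any(x not in (-1, 0, 1) for x in list(predicted) + list(observed)):
        raise ValueError("entries must be -1, 0 or +1")

    n = len(predicted)
    rows, cols = Counter(predicted), Counter(observed)
    rp, rm, r0 = rows[1], rows[-1], rows[0]   # predicted class counts
    cp, cm, c0 = cols[1], cols[-1], cols[0]   # observed class counts

    # Product of the factorials of all row and column totals.
    margins = 1
    for count in (rp, rm, r0, cp, cm, c0):
        margins *= factorial(count)

    dist = {}
    # Four cells are free; the other five follow from the margins.
    for a_pp in range(min(rp, cp) + 1):
        for a_pm in range(min(rp - a_pp, cm) + 1):
            for a_mp in range(min(rm, cp - a_pp) + 1):
                for a_mm in range(min(rm - a_mp, cm - a_pm) + 1):
                    a_p0 = rp - a_pp - a_pm
                    a_m0 = rm - a_mp - a_mm
                    a_0p = cp - a_pp - a_mp
                    a_0m = cm - a_pm - a_mm
                    a_00 = r0 - a_0p - a_0m
                    if a_00 < 0:
                        continue  # table inconsistent with the margins
                    cells = 1
                    for a in (a_pp, a_pm, a_p0, a_mp, a_mm, a_m0,
                              a_0p, a_0m, a_00):
                        cells *= factorial(a)
                    # Probability of this table under random permutation.
                    prob = Fraction(margins, factorial(n) * cells)
                    # Agreements count +1, sign disagreements count -1.
                    score = a_pp + a_mm - a_pm - a_mp
                    dist[score] = dist.get(score, 0) + prob
    return dist


def agreement_p_value(predicted, observed):
    """One-sided p-value: P(null score >= observed score)."""
    score = sum(p * o for p, o in zip(predicted, observed))
    dist = ternary_dot_product_distribution(predicted, observed)
    return sum(prob for s, prob in dist.items() if s >= score)
```

For example, `agreement_p_value([1, 0], [1, 0])` evaluates to `Fraction(1, 2)`: of the two orderings of the observed vector, one reproduces the observed score of 1. Exact fractions are used so that the probabilities sum to exactly 1; the enumeration grows quickly with the class counts, which is the cost the paper's efficient algorithms are meant to avoid.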

