Stability of building gene regulatory networks with sparse autoregressive models

Rajapakse, Jagath C; Mundra, Piyushkumar A

Author(s)

Rajapakse, Jagath; Mundra, Piyushkumar A.

Terms of use

Creative Commons Attribution http://creativecommons.org/licenses/by/2.0

Abstract

Background: Biological networks are constantly subjected to random perturbations, and efficient feedback and compensatory mechanisms exist to maintain their stability. There is increasing interest in building gene regulatory networks (GRNs) from temporal gene expression data because of their numerous applications in the life sciences. However, because of the limited number of time points at which gene expression can be measured in practice, computational techniques for building GRNs often suffer from inaccuracy and instability. This paper investigates the stability of sparse autoregressive models for building GRNs from gene expression data.
Results: Criteria for evaluating the stability of estimating GRN structure are proposed. Using these criteria, the stability of multivariate vector autoregressive (MVAR) methods (ridge, lasso, and elastic-net) for building GRNs was studied on temporal gene expression datasets simulated on scale-free topologies as well as on real data gathered over the HeLa cell cycle. The effects of the number of time points on the stability of constructing GRNs are investigated. When the number of time points is low relative to the size of the network, both accuracy and stability are adversely affected. At least as many time points as there are genes in the network are needed to achieve decent accuracy and stability. Our results on synthetic data indicate that the stabilities of the lasso and elastic-net MVAR methods are comparable, and that their accuracies are much higher than that of ridge MVAR. As the size of the network grows, the number of time points required to achieve acceptable accuracy and stability becomes much smaller relative to the number of genes in the network. The effects of false negatives are easier to reduce by increasing the number of time points than those of false positives. Application to a HeLa cell-cycle gene expression dataset shows that biologically stable GRNs can be obtained by introducing perturbations to the data.
Conclusions: Accuracy and stability of GRN construction are crucial for investigating gene regulation. Sparse MVAR techniques such as lasso and elastic-net provide accurate and stable methods for building GRNs, even small ones. The effects of false negatives are corrected much more easily by increasing the number of time points than those of false positives. With real data, we demonstrate how stable networks can be derived by introducing random perturbations to the data.
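
The approach summarized in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a first-order autoregressive model, fits one lasso regression per target gene with a plain coordinate-descent solver, and uses the mean pairwise Jaccard overlap of edge sets under Gaussian perturbation as a stand-in stability score. The abstract does not specify the paper's stability criteria, model order, or perturbation scheme, so all function names and the parameters `lam`, `sigma`, and `n_runs` are hypothetical choices made for illustration. Data are not centered or standardized, which a real analysis would likely do.

```python
import random

def simulate_var1(A, T, noise=0.1, seed=0):
    """Simulate T time points of x_t = A x_{t-1} + Gaussian noise (assumed order 1)."""
    rng = random.Random(seed)
    n = len(A)
    x = [rng.gauss(0, 1) for _ in range(n)]
    series = [x]
    for _ in range(T - 1):
        x = [sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0, noise)
             for i in range(n)]
        series.append(x)
    return series

def soft_threshold(z, lam):
    """Shrink z toward zero by lam (the proximal operator of the L1 penalty)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2m)) * ||y - Xw||^2 + lam * ||w||_1."""
    m, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(X[r][j] ** 2 for r in range(m)) / m for j in range(p)]
    resid = list(y)  # residual y - Xw, kept up to date as w changes
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            rho = sum(X[r][j] * resid[r] for r in range(m)) / m + col_sq[j] * w[j]
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0:
                for r in range(m):
                    resid[r] -= delta * X[r][j]
                w[j] = new_w
    return w

def fit_grn(series, lam):
    """Return the set of edges (regulator j, target i) with nonzero coefficient."""
    X, n = series[:-1], len(series[0])
    edges = set()
    for i in range(n):
        y = [row[i] for row in series[1:]]  # expression of gene i one step later
        w = lasso_cd(X, y, lam)
        edges |= {(j, i) for j in range(n) if w[j] != 0.0}
    return edges

def jaccard(a, b):
    """Overlap of two edge sets; two empty sets count as identical."""
    return len(a & b) / len(a | b) if (a | b) else 1.0

def stability(series, lam, n_runs=5, sigma=0.05, seed=1):
    """Mean pairwise Jaccard overlap of networks fitted to perturbed copies of the data."""
    rng = random.Random(seed)
    nets = []
    for _ in range(n_runs):
        noisy = [[v + rng.gauss(0, sigma) for v in row] for row in series]
        nets.append(fit_grn(noisy, lam))
    pairs = [(a, b) for k, a in enumerate(nets) for b in nets[k + 1:]]
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    # Toy 3-gene chain: gene 0 regulates gene 1, which regulates gene 2.
    A = [[0.5, 0.0, 0.0], [0.4, 0.5, 0.0], [0.0, 0.4, 0.5]]
    data = simulate_var1(A, T=30)
    print("edges:", sorted(fit_grn(data, lam=0.01)))
    print("stability:", stability(data, lam=0.01))
```

Varying `T` relative to the number of genes in such a simulation is one way to explore the dependence of stability on the number of time points that the abstract describes. Replacing the L1 penalty with an L2 or mixed penalty would give the ridge and elastic-net counterparts.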

Description

This article has been published as part of BMC Bioinformatics Volume 12 Supplement 13, 2011: Tenth International Conference on Bioinformatics – First ISCB Asia Joint Conference 2011 (InCoB/ISCB-Asia 2011): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/12?issue=S13.

Date issued

2011-11

URI

http://hdl.handle.net/1721.1/70529

Department

Massachusetts Institute of Technology. Department of Biological Engineering; Singapore-MIT Alliance in Research and Technology (SMART)

Journal

BMC Bioinformatics

Publisher

BioMed Central Ltd

Citation

BMC Bioinformatics. 2011 Nov 30;12(Suppl 13):S17

Version: Final published version

ISSN

1471-2105

Collections

MIT Open Access Articles

DSpace@MIT