BMC Bioinformatics. 2015;16 Suppl 7(Suppl 7):S7.
doi: 10.1186/1471-2105-16-S7-S7. Epub 2015 Apr 23.

A novel procedure for statistical inference and verification of gene regulatory subnetwork

Haijun Gong et al. BMC Bioinformatics. 2015.

Abstract

Background: Reconstructing gene regulatory networks from time-course microarray data can help us understand biological systems comprehensively and discover the pathogenesis of cancer and other diseases. However, correctly and efficiently deciphering a gene regulatory network from high-throughput gene expression data is a major challenge because of the relatively small number of observations and the curse of dimensionality. Computational biologists have developed many statistical inference and machine learning algorithms to analyze microarray data. In previous studies, the correctness of an inferred regulatory network was checked manually by comparing it with public databases or an existing model.

Results: In this work, we present a novel procedure to automatically infer and verify gene regulatory networks from time series expression data. A dynamic Bayesian network, a statistical inference algorithm, is first used to infer an optimal network from time series microarray data of S. cerevisiae. A weighted symbolic model checker is then applied to automatically verify or falsify the inferred network by checking desired temporal logic formulas abstracted from experiments or public databases.

Conclusions: Our studies show that combining a statistical inference algorithm with a model checking technique provides a more efficient way to automatically infer and verify gene regulatory networks from time series expression data than the approaches used in previous studies.
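
As a rough illustration of the verification step, the sketch below checks two temporal properties on a small weighted Boolean network. It is not the authors' weighted symbolic model checker: it enumerates states explicitly instead of symbolically, and the three-gene feedback loop, the edge weights, and the sign-of-weighted-sum update rule are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]

# Hypothetical three-gene feedback loop (names and weights are illustrative).
# Positive weight = activation, negative weight = inhibition.
GENES: List[str] = ["A", "B", "C"]
WEIGHTS: Dict[Tuple[str, str], float] = {
    ("A", "B"): 0.8,   # A activates B
    ("B", "C"): 0.6,   # B activates C
    ("C", "A"): -0.9,  # C inhibits A
}


def step(state: State) -> State:
    """Synchronous update (assumed rule): a gene turns on if the weighted
    sum of its active regulators is positive, off if negative, and keeps
    its current value if the sum is zero."""
    current = dict(zip(GENES, state))
    nxt = []
    for target in GENES:
        total = sum(
            w * current[src] for (src, dst), w in WEIGHTS.items() if dst == target
        )
        if total > 0:
            nxt.append(1)
        elif total < 0:
            nxt.append(0)
        else:
            nxt.append(current[target])
    return tuple(nxt)


def reachable(initial: State) -> List[State]:
    """States visited from `initial` until the deterministic path repeats."""
    seen: List[State] = []
    state = initial
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen


def eventually(initial: State, prop: Callable[[State], bool]) -> bool:
    """True if some reachable state satisfies `prop`."""
    return any(prop(s) for s in reachable(initial))


def always(initial: State, prop: Callable[[State], bool]) -> bool:
    """True if every reachable state satisfies `prop`."""
    return all(prop(s) for s in reachable(initial))


start: State = (1, 0, 0)  # only A is expressed initially
print(reachable(start))
print(eventually(start, lambda s: s[2] == 1))                   # C is eventually on
print(always(start, lambda s: not (s[0] == 1 and s[2] == 1)))   # A and C never both on
```

Under these assumed weights the first property holds and the second is falsified, which mirrors how an inferred network could be accepted or rejected against a desired formula.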

Figures

Figure 1
Illustration of gene regulatory network (A) and dynamic Bayesian network (B). The gene regulatory network is composed of a feedback loop. Arrows represent activation, and circle-headed arrows denote inhibition. The random variable Xij represents gene j measured at time i.
Figure 2
Pseudocode of gene regulatory network inference and formal verification. Part I describes the dynamic Bayesian network inference method implemented by Banjo; part II describes the formal verification implemented by the weighted symbolic model checker.
Figure 3
Flowchart of gene regulatory network inference from time series microarray data and formal verification. The dynamic Bayesian network inference (A1-A4) is implemented by Banjo, and the inferred network's verification (B1-B3) is implemented by the weighted symbolic model verifier (SMV).
Figure 4
Illustration of weighted symbolic model checking of the regulatory network in Fig. 1. The state transition update depends on the modified influence score (weight wi) calculated by Banjo.
Figure 5
An optimal subnetwork of the MAPK pathway inferred by Banjo. The optimal network is inferred using the i2 interval discretization method. The directed and circle-headed arrows represent activation and inhibition, respectively. The value on each edge is the influence score, or weight, describing the interaction between two nodes.
Figure 6
Two optimal subnetworks of the cell cycle inferred by Banjo. (A) and (B) are the optimal networks inferred using the i2 interval discretization and q2 quantile discretization methods, respectively.
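
The captions for Figures 5 and 6 refer to two-bin interval (i2) and two-bin quantile (q2) discretization. The sketch below shows one plausible reading of the difference: interval discretization splits the value range into equal-width bins, while quantile discretization splits the ranked observations into equal-count bins. The function names, the tie handling, and the sample expression values are assumptions for illustration, not Banjo's implementation.

```python
from typing import List


def interval_discretize(values: List[float], bins: int = 2) -> List[int]:
    """Equal-width bins over [min, max]; the maximum falls in the last bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def quantile_discretize(values: List[float], bins: int = 2) -> List[int]:
    """Equal-count bins by rank; ties are split by position (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = min(rank * bins // len(values), bins - 1)
    return labels


# Hypothetical expression profile of one gene with a single outlier.
expr = [0.1, 0.2, 0.3, 0.4, 5.0, 0.25]
print(interval_discretize(expr))  # → [0, 0, 0, 0, 1, 0]
print(quantile_discretize(expr))  # → [0, 0, 1, 1, 1, 0]
```

With a single large outlier, the interval method places almost every observation in the low bin, whereas the quantile method keeps the bins balanced. Differences of this kind are one reason the two discretizations could lead to different inferred networks.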
