Nucleic Acids Res. 2013 Jan 7;41(1):e24.
doi: 10.1093/nar/gks904. Epub 2012 Oct 4.

Reverse engineering and analysis of large genome-scale gene networks

Maneesha Aluru et al. Nucleic Acids Res.

Abstract

Reverse engineering the whole-genome networks of complex multicellular organisms remains a challenge. While simpler models scale easily to large numbers of genes and gene expression datasets, more accurate models are compute-intensive, which limits the scale at which they can be applied. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) a B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) parallel algorithms that reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet, and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software tool, Gene Network Analyzer (GeNA), for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we analyzed 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web.
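As a rough illustration of the MI-plus-permutation-testing idea mentioned in the abstract, the sketch below estimates MI between two expression profiles and attaches a permutation p-value. It is not the TINGe method: plain equal-width binning stands in for the B-spline estimator, and a generic shuffle-based test stands in for the paper's direct permutation testing algorithm. The function names, the bin count and the number of permutations are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def bin_index(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant profiles fall into bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram-based MI estimate (in nats) between two profiles."""
    bx, by = bin_index(x, bins), bin_index(y, bins)
    m = len(x)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (i, j), c in pxy.items():
        # p(i,j) * log(p(i,j) / (p(i) * p(j))), written with raw counts
        mi += (c / m) * math.log((c * m) / (px[i] * py[j]))
    return mi


def permutation_p_value(x, y, permutations=200, bins=4, seed=0):
    """Return (observed MI, p-value) by shuffling y to break any dependence."""
    rng = random.Random(seed)
    observed = mutual_information(x, y, bins)
    shuffled = list(y)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if mutual_information(x, shuffled, bins) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero
    return observed, (hits + 1) / (permutations + 1)
```

In a network-inference setting, an edge between two genes would be kept only when this p-value falls below a chosen threshold; the threshold itself is not specified in the abstract.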

Figures

Figure 1.
The network to be inferred is represented as an n × n adjacency matrix D, where n is the number of genes. The matrix is partitioned into p × p blocks of submatrices as shown. Each processor is assigned a row of submatrices. The number inside a submatrix indicates the stage at which the submatrix is computed. Only half the matrix is computed as it is symmetric.
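One way to picture a stage-wise schedule of this kind is sketched below. The caption does not state the exact stage numbering used in the figure, so the rule here is an assumption: at stage s, the processor owning block row i computes block (i, (i + s) mod p). The function name `block_schedule` is hypothetical.

```python
def block_schedule(p):
    """Return {stage: [(row_block, col_block), ...]} for p block rows.

    Assumed rule: at stage s, processor `rank` computes block
    (rank, (rank + s) % p). Every unordered pair of block indices is
    covered exactly once, so only half of the symmetric matrix is computed.
    """
    schedule = {}
    for stage in range(p // 2 + 1):
        blocks = []
        for rank in range(p):
            # For even p, the last stage pairs rank with rank + p/2; the
            # second half of the ranks would only recompute mirror images.
            if p % 2 == 0 and stage == p // 2 and rank >= p // 2:
                continue
            blocks.append((rank, (rank + stage) % p))
        schedule[stage] = blocks
    return schedule
```

For example, `block_schedule(4)` yields three stages containing 4, 4 and 2 blocks, which together make up the 10 distinct blocks of a symmetric 4 × 4 block matrix.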
Figure 2.
Comparison of TINGe with ARACNe and GeneNet on synthetic data using SynTReN.
Figure 3.
Scalability of TINGe for datasets with different numbers of genes n, and different numbers of expression observations m.
Figure 4.
A partial rendering of the Arabidopsis whole-genome network. The illustrated network represents a union of all the shortest paths between each pair of the top 5% of the hubs in the whole-genome network (Supplementary Table S4). It contains 1556 genes and 22 073 interactions. The network topology is displayed using Cytoscape with the size of a node proportional to its degree and the intensity of its color proportional to its betweenness centrality. The largest and the darkest node in the bottom right-hand corner of the figure is PMDH2, a gene involved in photorespiration.
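The construction described in this caption, a union of shortest paths between top-degree hubs, can be sketched as follows. This is an illustrative sketch and not the authors' code: it assumes an unweighted, undirected network stored as a dictionary of neighbor sets, and the helper names `build_adjacency`, `bfs_distances` and `hub_shortest_path_union` are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edge_list):
    """Build an undirected adjacency dict {node: set(neighbors)}."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def hub_shortest_path_union(adj, hub_fraction=0.05):
    """Return (hubs, edges) where edges lie on some shortest hub-to-hub path."""
    ranked = sorted(adj, key=lambda g: len(adj[g]), reverse=True)
    hubs = ranked[:max(2, int(len(ranked) * hub_fraction))]
    dist = {h: bfs_distances(adj, h) for h in hubs}
    edges = set()
    for s, t in combinations(hubs, 2):
        if t not in dist[s]:
            continue  # hubs in different connected components
        d = dist[s][t]
        for u in adj:
            for v in adj[u]:
                # Edge (u, v) is on a shortest s-t path iff the distances add up
                if dist[s].get(u, d) + 1 + dist[t].get(v, d) == d:
                    edges.add(frozenset((u, v)))
    return hubs, edges
```

Scanning every edge for every hub pair is simple but slow on a whole-genome network; a practical implementation would restrict the scan to nodes within distance d of both hubs.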
Figure 5.
Scale-free nature of the Arabidopsis whole-genome network, shown as the node degree distribution. The X-axis is the node degree and the Y-axis is the probability that a node has that degree.
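A degree distribution of this kind can be computed from an adjacency structure as sketched below. The function names are hypothetical, and the least-squares fit on log-log axes is only a crude diagnostic for power-law behavior, added here as an assumption about how one might inspect the plot; the caption does not say how scale-freeness was assessed.

```python
import math
from collections import Counter


def degree_distribution(adj):
    """Return {degree k: fraction of nodes with degree k}."""
    n = len(adj)
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}


def loglog_slope(dist):
    """Least-squares slope of log P(k) against log k, ignoring k = 0.

    Needs at least two distinct positive degrees. A roughly straight line
    with negative slope is consistent with a power-law tail.
    """
    pts = [(math.log(k), math.log(p)) for k, p in dist.items() if k > 0]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx
```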
Figure 6.
Cellulose subnetwork. Red—seed genes; green—genes sharing the same GO category as the seed genes; blue—genes with associated functions; pink—genes of interacting pathways and yellow—unclassified genes.
Figure 7.
Carotenoid subnetwork. Color coding as given in Figure 6.
Figure 8.
Aerobic respiration subnetwork. Color coding as given in Figure 6.
