Nat Biotechnol. 2013 Aug;31(8):720-5.
doi: 10.1038/nbt.2601. Epub 2013 Jul 14.

Network link prediction by global silencing of indirect correlations

Baruch Barzel et al. Nat Biotechnol. 2013 Aug.

Abstract

Predictions of physical and functional links between cellular components are often based on correlations between experimental measurements, such as gene expression. However, correlations are affected by both direct and indirect paths, confounding our ability to identify true pairwise interactions. Here we exploit the fundamental properties of dynamical correlations in networks to develop a method to silence indirect effects. The method receives as input the observed correlations between node pairs and uses a matrix transformation to turn the correlation matrix into a highly discriminative silenced matrix, which enhances only the terms associated with direct causal links. Against empirical data for Escherichia coli regulatory interactions, the method enhanced the discriminative power of the correlations by twofold, yielding >50% predictive improvement over traditional correlation measures and 6% over mutual information. Overall this silencing method will help translate the abundant correlation data into insights about a system's interactions, with applications ranging from link prediction to inferring the dynamical mechanisms governing biological networks.

Figures

Figure 1. Silencing indirect links
(a) The experimentally observed global response matrix, Gij, accounts for direct as well as indirect correlations, with no clear separation between them. The source of Gij could be gene coexpression data, statistical correlations or genetic perturbation experiments. (b) In the absence of a clear separation between the Gij terms assigned to direct and indirect correlations, our ability to infer direct physical links (solid lines) is limited. Simple thresholding, i.e., accepting all links for which Gij exceeds a predefined threshold, is known to predict spurious links (strong dashed lines) and overlook true links (light solid lines). (c) While the average Gij terms associated with direct links (dark blue) are higher than the average terms associated with indirect links (light blue), as captured by the discrimination ratio, ΔG, the difference is not sufficient to distinguish direct from indirect links. (d) Silencing is achieved through Eq. (5), which exploits the flow of information in the network: the flow from the source (j) to the target (i) is carried through the indirect effect Gkj (orange) coupled with the direct impact Sik of the target's nearest neighbor k (blue). By silencing the indirect contributions, Eq. (5) provides the local response matrix, Sij, whose non-zero elements correspond to direct links. (e) – (f) In Sij the terms associated with indirect links are silenced, allowing us to detect only the direct links of the underlying network. (g) As indirect terms become much smaller in Sij, we obtain a greater discrimination ratio, ΔS. The degree of silencing, κ, captures the increase observed in the discrimination ratio by the transition from Gij to Sij (5).
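The quantities in panels (c) and (g) can be illustrated with a minimal Python sketch. It rests on two assumptions that the caption does not spell out: that the discrimination ratio Δ is the mean magnitude of the terms on direct links divided by the mean magnitude of the terms on indirect pairs, and that the degree of silencing is κ = ΔS/ΔG. The three-node matrices and the function name `discrimination_ratio` are hypothetical and serve only as an illustration.

```python
import numpy as np

def discrimination_ratio(M, direct):
    """Mean |M_ij| over direct links divided by mean |M_ij| over indirect pairs.

    Assumed definition (ratio of averages); diagonal terms are ignored.
    """
    M = np.abs(np.asarray(M, dtype=float))
    direct = np.asarray(direct, dtype=bool)
    off_diag = ~np.eye(M.shape[0], dtype=bool)
    direct_mean = M[direct & off_diag].mean()
    indirect_mean = M[~direct & off_diag].mean()
    return direct_mean / indirect_mean

# Hypothetical chain 0 - 1 - 2: pairs (0,1) and (1,2) are direct, (0,2) is indirect.
direct = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]], dtype=bool)

# Made-up global response: the indirect pair (0,2) is still strongly correlated.
G = np.array([[1.0, 0.8, 0.6],
              [0.8, 1.0, 0.8],
              [0.6, 0.8, 1.0]])

# Made-up local response: the indirect term is almost silenced.
S = np.array([[0.00, 0.80, 0.05],
              [0.80, 0.00, 0.80],
              [0.05, 0.80, 0.00]])

delta_G = discrimination_ratio(G, direct)
delta_S = discrimination_ratio(S, direct)
kappa = delta_S / delta_G  # assumed definition of the degree of silencing

print(f"Delta_G = {delta_G:.3f}, Delta_S = {delta_S:.3f}, kappa = {kappa:.3f}")
```

With these toy numbers the direct terms are unchanged between the two matrices, so the whole gain in κ comes from shrinking the single indirect term.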
Figure 2. Network inference in model systems
We numerically simulated Michaelis-Menten dynamics on a scale-free network [40-42], extracting the correlations Gij between all pairs of nodes (see Sec. S.III for details). (a) Gij and Sij associated with interacting (green) and non-interacting (orange) node pairs. Sij silences the correlations associated with indirect interactions, resulting in a clear separation between direct and indirect interactions, a phenomenon absent from Gij. (b) ROC curve obtained from Gij (red, area 0.91) and Sij (blue, area 0.997). The Sij network reaches 100% accuracy with a negligible number of false positives. (c) Precision obtained for threshold q for Gij (red) and Sij (blue). The gradual rise of the Gij-based precision indicates that for a broad range of thresholds only a small fraction of the links will be identified. In contrast, the steep rise in precision for Sij indicates its enhanced discriminative power between direct and indirect links: virtually any non-zero Sij corresponds to a directly interacting pair. (d) The discrimination ratio, Δ, is much higher in Sij (blue) than in Gij (red). This indicates that Sij is a much better predictor of direct vs. indirect interactions. The silencing (5), which captures the increase in the discrimination ratio, is κ = 15.0. (e) Silencing increases with the path length dij between i and j, so that the more indirect the link, the more dramatic the silencing. (f) The source of Sij's success is the silencing effect, here illustrated on correlations measured for a linear cascade. The reconstruction of the cascade from Gij is confounded by numerous non-vanishing indirect correlations. In Sij the indirect correlations are silenced, providing a perfect reconstruction.
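The cascade in panel (f) can be reproduced in a simplified setting. The sketch below is not the paper's Eq. (5); it assumes an idealized, noise-free, acyclic cascade in which the path-sum relation described for Fig. 1(d), Gij = Σk Sik Gkj for i ≠ j with Gii = 1, holds exactly. Under that assumption G = (I − S)^(-1), so the local response can be recovered as S = I − G^(-1). The function names, the cascade length and the link strength 0.8 are hypothetical.

```python
import numpy as np

def cascade_local_response(n, strength):
    """Local response of a hypothetical linear cascade 0 -> 1 -> ... -> n-1."""
    S = np.zeros((n, n))
    for i in range(1, n):
        S[i, i - 1] = strength  # node i responds directly only to node i-1
    return S

def global_response(S):
    """Sum the effect of all paths: G = I + S + S^2 + ... = (I - S)^-1.

    The series terminates here because S is nilpotent for an acyclic cascade.
    """
    return np.linalg.inv(np.eye(S.shape[0]) - S)

def silence_acyclic(G):
    """Invert the relation above: S = I - G^-1.

    Valid only under the acyclic, unit-diagonal assumption stated in the lead-in.
    """
    return np.eye(G.shape[0]) - np.linalg.inv(G)

S_true = cascade_local_response(5, 0.8)
G = global_response(S_true)
S_rec = silence_acyclic(G)

# G keeps a sizeable term between the two ends of the cascade (0.8**4),
# while the recovered S has no term there.
print("G[4, 0] =", round(G[4, 0], 4))
print("S_rec[4, 0] =", round(abs(S_rec[4, 0]), 4))
```

In this toy case every indirect term of G is a product of the direct strengths along the path, which is why it decays with path length but never vanishes, whereas the recovered S keeps only the four nearest-neighbor terms.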
Figure 3. Inferring regulatory interactions in E. coli
(a) Starting from gene expression data, we used Pearson correlations in expression patterns to construct Gij for 4,511 E. coli genes, obtaining Sij via (4). We compared our predictions to a gold standard of experimentally verified genetic regulatory links [19]. The area under the ROC curve (AUROC) is increased from 0.59 to 0.64 in the transition from Gij to Sij, representing a 56% improvement (above the baseline of 0.5 for a random guess). (b) An improvement of 67% is observed for Spearman rank correlations. (c) A less dramatic improvement of 6% is shown when Gij is constructed using mutual information. (d) The discrimination ratio for all three methods compared with that obtained from the pertinent Sij matrix. The transition to Sij (4) increases the discrimination between direct and indirect interactions by a factor of two or more, so that indirect interactions have a significantly lower expression in Sij. (e) - (f) This observation becomes even more dramatic when focusing on two specific motifs: cascades and co-regulators. In Gij the indirect correlation between X and Y, which is induced by the intermediate node, I, may lead to the false prediction of the spurious X – Y link. Thanks to silencing, the discrimination between the direct and indirect links in these motifs is increased by a factor of three or more for Pearson and Spearman correlations, and by a factor of about two for mutual information.
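The 56% figure in panel (a) follows from measuring each AUROC relative to the random-guess baseline of 0.5. The short Python sketch below shows that arithmetic, together with a generic rank-based AUROC estimate for a scored list of node pairs. The helper names and the four-pair toy data are hypothetical, and the AUROC routine is a standard pairwise-comparison estimator, not necessarily the procedure used by the authors.

```python
import numpy as np

def auroc(scores, labels):
    """Probability that a true link outscores a non-link (ties count as half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]  # every true-link vs. non-link comparison
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def improvement_over_random(auroc_before, auroc_after, baseline=0.5):
    """Relative gain in the AUROC excess above the random-guess baseline."""
    return (auroc_after - baseline) / (auroc_before - baseline) - 1.0

# Values quoted in the caption for Pearson correlations: 0.59 (Gij) -> 0.64 (Sij).
gain = improvement_over_random(0.59, 0.64)
print(f"improvement = {100 * gain:.0f}%")

# Toy ranking: two true links scored above two non-links gives a perfect AUROC.
toy_auroc = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
print("toy AUROC =", toy_auroc)
```

The excess above the baseline grows from 0.09 to 0.14, and 0.14/0.09 ≈ 1.56, which is the quoted 56% improvement.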
Figure 4. Silencing in a noisy environment
To test the method's performance in the presence of a noisy input we added Gaussian noise to the numerically obtained Gij, and measured the silencing, κ, vs. the signal-to-noise ratio θ. For low noise levels (θ ≲ 0.1) silencing is relatively unharmed. At higher noise levels silencing decreases as κ ~ θ^(-1), a slow decay that supports the robustness of the method. Silencing is lost at θ_C ≈ 0.75, when the signal is almost fully driven by the noise.
Figure 5. Performance with hidden nodes
(a) A network with N = 8 nodes of which a fraction η = 1/4 are hidden. The observable sub-network has six nodes, five forming a connected component (with 10 connected node pairs) and one isolated (6 isolated pairs). The ratio between isolated and connected node pairs here is ρ = 6/10. Equation (5), applied to the observable network, successfully silences the indirect Gij terms among the nodes of the connected component. However, the correlations between the isolated node and the rest of the network, lacking an indirect path, are not silenced. (b) To test the silencing in the presence of hidden nodes we used the numerically obtained Gij (Fig. 2) from which we eliminated a fraction η of the nodes, obtaining an observable network with 10^4 isolated node pairs (ρ ≈ 10^3). After applying Eq. (5) to the remaining nodes we find that the silencing of Gij terms associated with connected node pairs is unaffected (orange bar), while for the isolated node pairs silencing drops to κ = 1, namely no silencing (purple bar). Hence for the isolated node pairs Sij is not more predictive than Gij. (c) Increasing the fraction of hidden nodes, η (top horizontal axis), we measured κ vs. ρ. As expected, silencing is observed as long as most node pairs are connected via finite paths (ρ < 1). However, when the number of hidden nodes is increased to the point that the isolated pairs dominate (ρ > 1), silencing is no longer observed (κ = 1). The critical fraction of hidden nodes, η_C, corresponds to ρ = 1, the point where silencing no longer plays a significant role. Here we find η_C ≈ 0.57 (blue arrow), in agreement with the prediction of Eq. (7).

Comment in

  • Network cleanup.
    Alipanahi B, Frey BJ. Nat Biotechnol. 2013 Aug;31(8):714-5. doi: 10.1038/nbt.2657. PMID: 23929347. No abstract available.
  • Silence on the relevant literature and errors in implementation.
    Bastiaens P, Birtwistle MR, Blüthgen N, Bruggeman FJ, Cho KH, Cosentino C, de la Fuente A, Hoek JB, Kiyatkin A, Klamt S, Kolch W, Legewie S, Mendes P, Naka T, Santra T, Sontag E, Westerhoff HV, Kholodenko BN. Nat Biotechnol. 2015 Apr;33(4):336-9. doi: 10.1038/nbt.3185. PMID: 25850052. No abstract available.
  • Response to letter of correspondence - Bastiaens et al.
    Barzel B, Barabási AL. Nat Biotechnol. 2015 Apr;33(4):339-42. doi: 10.1038/nbt.3184. PMID: 25850053. No abstract available.

References

    1. Vendruscolo M. In: Networks in Cell Biology. Buchanan M, Caldarelli G, De Los Rios P, Rao F, editors. Cambridge University Press; 2010.
    2. Ideker T, Sharan R. Protein networks in disease. Genome Res. 2008;18:644–652.
    3. Kann MG. Protein interactions and disease: Computational approaches to uncover the etiology of diseases. Briefings in Bioinformatics. 2007;8:333–346.
    4. Albert R. Scale-free networks in cell biology. Journal of Cell Science. 2005;118:4947–57.
    5. Barabási A-L, Oltvai ZN. Network biology: understanding the cell's functional organization. Nature Reviews Genetics. 2004;5:101–113.
