Network deconvolution as a general method to distinguish direct dependencies in networks

Soheil Feizi¹, Daniel Marbach, Muriel Médard, Manolis Kellis

Affiliations

PMID: 23851448
PMCID: PMC3773370
DOI: 10.1038/nbt.2635

Network deconvolution as a general method to distinguish direct dependencies in networks

Soheil Feizi et al. Nat Biotechnol. 2013 Aug.

. 2013 Aug;31(8):726-33.

doi: 10.1038/nbt.2635. Epub 2013 Jul 14.

Authors

Soheil Feizi¹, Daniel Marbach, Muriel Médard, Manolis Kellis

Affiliation

¹ Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, USA.

PMID: 23851448
PMCID: PMC3773370
DOI: 10.1038/nbt.2635

Erratum in

Corrigendum: Network deconvolution as a general method to distinguish direct dependencies in networks.
Feizi S, Marbach D, Médard M, Kellis M. Feizi S, et al. Nat Biotechnol. 2015 Apr;33(4):424. doi: 10.1038/nbt0415-424. Nat Biotechnol. 2015. PMID: 25850062 No abstract available.

Abstract

Recognizing direct relationships between variables connected in a network is a pervasive problem in biological, social and information sciences as correlation-based networks contain numerous indirect relationships. Here we present a general method for inferring direct effects from an observed correlation matrix containing both direct and indirect effects. We formulate the problem as the inverse of network convolution, and introduce an algorithm that removes the combined effect of all indirect paths of arbitrary length in a closed-form solution by exploiting eigen-decomposition and infinite-series sums. We demonstrate the effectiveness of our approach in several network applications: distinguishing direct targets in gene expression regulatory networks; recognizing directly interacting amino-acid residues for protein structure prediction from sequence alignments; and distinguishing strong collaborations in co-authorship social networks using connectivity information alone. In addition to its theoretical impact as a foundational graph theoretic tool, our results suggest network deconvolution is widely applicable for computing direct dependencies in network science across diverse disciplines.

PubMed Disclaimer

Figures

**Figure 1. Network deconvolution overview**
a. Direct edges in a network (solid blue arrows) can lead to indirect relationships (dashed red arrows) as a result of transitive information flow. These indirect contributions can be of length two (e.g. 1→2→3), three (e.g. 1→2→3→5) or higher, and can combine both direct and indirect effects (e.g. 2→4), and multiple indirect effects along varying paths (e.g. 2→3→5, 2→4→5). Self-loops are excluded from networks. Network deconvolution seeks to reverse the effect of transitive information flow across all indirect paths, in order to recover the true direct network (blue edges, G_dir) based on the observed network (combined blue and red edges, G_obs). b. Algebraically, the transitive closure of a network can be expressed as an infinite sum of the true direct network and all indirect effects along paths of increasing lengths, which can be written in a closed form as an infinite-series sum. Network deconvolution exploits this closed form to express the direct network G_dir as a function of the observed network G_obs. c. To efficiently compute this inverse operation, we express both the true and observed networks G_dir and G_obs by decomposition into their eigenvectors and eigenvalues, which enables each eigenvalue λ_i ^dir of the original network to be expressed as a nonlinear function of a single corresponding eigenvalue λ_i ^obs of the convolved observed network. **d,e.** Network deconvolution assumes that indirect flow weights can be approximated as the product of direct edge weights and that observed edge weights are the sum of direct and indirect flows. When these assumptions hold **(d)**, network deconvolution removes all indirect flow effects and infers all direct interactions and weights exactly. Even when these assumptions do not hold **(e)**, ND infers 87% of direct edges, showing robustness to non-linear effects.

**Figure 2. Deconvolution of gene regulatory networks**
a. Network Deconvolution (ND) applied to the inferred networks of top-scoring methods from DREAM5 leads to consistent improvements for mutual information (MI) and correlation based methods (average performance increase 59%). ND also improves other top-scoring methods (11% on average), including the best performing method of the DREAM5 challenge (GENIE3), thus leading to a new overall highest performance. Moreover, the community network obtained by integrating network predictions from individual methods (1–9) before ND is outperformed by the community network based on deconvolved networks by ∼22%. b. Network motif analysis showing the relative performance of inference methods for cascades (casc.) and feed-forward loops (FFL) before and after ND. Red/blue corresponds to increased/decreased prediction accuracy of the two motif types relative to the overall performance of the method *before ND* (measured by the area under the ROC curve, AUROC; Supplementary Note S2.4). The original methods (before ND, left side) have different relative performances for cascades and FFLs, e.g., mutual information-based network inference (MI) tends to include feed-forward edges (red arrow), resulting in higher accuracy for FFLs but lower accuracy for cascades, while the opposite is true for the Inferelator and ANOVerence. The deconvolved networks (after ND, right side) show significantly higher accuracy for true cascade network motifs for all methods, and moderately improved accuracy for FFLs on average, showing that ND successfully eliminates spurious indirect feed-forward edges for true cascade motifs, without sacrificing accuracy for true FFLs.

**Figure 3. Application to protein structure prediction**
a. Applying ND to predict experimentally-determined residue contacts (gray dots) based on amino-acid sequence alignments on fifteen proteins in different folding classes with sizes ranging from 50 to 260 residues in human. We applied ND to networks derived by mutual information (MI, lower left triangles) and direct information (DI, upper-right triangles). Arrows highlight distinct residue interactions captured by each method, highlighting the improvement over both MI and DI. b. Cumulative distributions of graph weights for interacting (solid lines) and non-interacting (dashed lines) amino acid pairs, for both MI (blue) and ND (red). Network deconvolution assigns higher weights to true positive edges and lower weights to false negatives, leading to 5-fold higher discrimination between true contacts and indirect mutual information for the 10% of edges with highest scores

**Figure 4. Application to co-authorship social network**
a. Use of network deconvolution to distinguishing strong ties (red) and weak ties (green) in the largest connected component of a co-authorship network containing 379 authors. True collaboration strengths were computed by summing the number of co-authored papers and down-weighting each paper by the number of additional co-authors. ND only had access to unweighted co-authorship edges, but exploiting transitive relationships to weigh down weak ties resulting in 77% accurate predictions (solid lines) and only 23% inaccurate predictions (dashed lines), demonstrating that this information lies within the network edges, and that ND is well-suited for discovering it. b. Beyond the binary classification of strong and weak ties, we found a strong correlation (R²=0.76) across all 2,742 edges connecting 1,589 authors, between the weights assigned by ND (y-axis) and the true collaboration strengths (x-axis) obtained using additional publication details.

See this image and copyright information in PMC

Comment in

Network cleanup.
Alipanahi B, Frey BJ. Alipanahi B, et al. Nat Biotechnol. 2013 Aug;31(8):714-5. doi: 10.1038/nbt.2657. Nat Biotechnol. 2013. PMID: 23929347 No abstract available.

References

1. De Smet R, Marchal K. Advantages and limitations of current network inference methods. Nature reviews Microbiology. 2010;8:717–29. - PubMed
1. Newman MEJ. The structure and function of complex networks. SIAM review. 2003:167–256.
1. Koetter R, Médard M. An algebraic approach to network coding. IEEE/ACM transactions on networking. 2003;11:782–795.
1. Witten IH, Frank E, Hall MA. Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann; 2011.
1. Marbach D, et al. Wisdom of crowds for robust gene network inference. Nature Methods. 2012 - PMC - PubMed

Publication types

Actions
Actions
Actions

MeSH terms

Actions
Actions
Actions
Actions
Actions
Actions
Actions
Actions
Actions

Grants and funding

LinkOut - more resources

Full Text Sources
Other Literature Sources
- scite Smart Citations

Save citation to file

Email citation

Add to Collections

Add to My Bibliography

Your saved search

Create a file for external citation management software

Your RSS Feed

Network deconvolution as a general method to distinguish direct dependencies in networks

Affiliation

Network deconvolution as a general method to distinguish direct dependencies in networks

Authors

Affiliation

Erratum in

Abstract

Figures

Comment in

References

Publication types

MeSH terms

Grants and funding

LinkOut - more resources

Full Text Sources

Other Literature Sources