. 2013 Jul 12;8(7):e67995.

doi: 10.1371/journal.pone.0067995. Print 2013.

SMETANA: accurate and scalable algorithm for probabilistic alignment of large-scale biological networks

Sayed Mohammad Ebrahim Sahraeian¹, Byung-Jun Yoon

Affiliations

PMID: 23874484
PMCID: PMC3710069
DOI: 10.1371/journal.pone.0067995

SMETANA: accurate and scalable algorithm for probabilistic alignment of large-scale biological networks

Sayed Mohammad Ebrahim Sahraeian et al. PLoS One. 2013.

. 2013 Jul 12;8(7):e67995.

doi: 10.1371/journal.pone.0067995. Print 2013.

Authors

Sayed Mohammad Ebrahim Sahraeian¹, Byung-Jun Yoon

Affiliation

¹ Department of Plant and Microbial Biology, University of California, Berkely, California, USA.

PMID: 23874484
PMCID: PMC3710069
DOI: 10.1371/journal.pone.0067995

Abstract

In this paper we introduce an efficient algorithm for alignment of multiple large-scale biological networks. In this scheme, we first compute a probabilistic similarity measure between nodes that belong to different networks using a semi-Markov random walk model. The estimated probabilities are further enhanced by incorporating the local and the cross-species network similarity information through the use of two different types of probabilistic consistency transformations. The transformed alignment probabilities are used to predict the alignment of multiple networks based on a greedy approach. We demonstrate that the proposed algorithm, called SMETANA, outperforms many state-of-the-art network alignment techniques, in terms of computational efficiency, alignment accuracy, and scalability. Our experiments show that SMETANA can easily align tens of genome-scale networks with thousands of nodes on a personal computer without any difficulty. The source code of SMETANA is available upon request. The source code of SMETANA can be downloaded from http://www.ece.tamu.edu/~bjyoon/SMETANA/.

PubMed Disclaimer

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. Performance of various network alignment algorithms.**
(A) Equivalence class coverage: Number of equivalence classes in the 5-way alignment experiment that contain nodes from species (). (B) Node Coverage: Number of nodes (i.e. proteins) that belong to equivalence classes that contain nodes from species in the 5-way alignment.

formula image — **Figure 1. Performance of various network alignment algorithms.**
(A) Equivalence class coverage: Number of equivalence classes in the 5-way alignment experiment that contain nodes from species (). (B) Node Coverage: Number of nodes (i.e. proteins) that belong to equivalence classes that contain nodes from species in the 5-way alignment.

**Figure 2. Performance of various network alignment algorithms.**
(A) Equivalence class coverage: Number of equivalence classes in the 8-way alignment experiment that contain nodes from species (). (B) Node Coverage: Number of nodes (i.e. proteins) that belong to equivalence classes that contain nodes from species in the 8-way alignment.

**Figure 3. Number of conserved interactions (CI) and conserved orthologous interactions (COI) for different alignments.**
CI reports the total number of conserved edges between any of the equivalence classes in the alignment. COI reports the total number of conserved edges between “correct” equivalence classes that consist of orthologous nodes.

**Figure 4. Effects of node similarity on the performance of different network alignment algorithms.**
The alignment performance has been estimated at several different levels of separation between the similarity score distribution for orthologous node pairs and that for non-orthologs pairs. Increasing the bias increases the separation between the two score distributions, which increases the discriminative power of the node similarity score for predicting potential orthologs. Measures reported: specificity (SPE), number of correct nodes (CN) (which reflects the sensitivity), mean normalized entropy (MNE), number of conserved interactions (CI), number of conserved orthologous interactions (COI).

**Figure 5. Total CPU time for aligning the networks.**
The total CPU time for the pairwise, 5-way, and 8-way alignments. CPU time has been averaged over DMC, DMR, and CG datasets (measured in seconds).

**Figure 6. Performance on real networks.**
The trend of change in sensitivity (SEN) and specificity (SP) as the number of networks in the alignment increases for different multiple network alignment algorithms.

**Figure 7. Conserved subnetwork regions in the 3-way alignment of *D. melanogaster*, *H. sapiens*, and *S. cerevisiae* using the proposed method.**
(A) Transcription factor. (B) Replication factor C. (C) RNA polymerase. (D) DNA replication. (Aligned nodes are placed in the same row of alignment and connected with yellow dashed lines. In each network, the interaction in the IsoBase dataset is shown in solid lines. For *D. melanogaster* some edges which are missed in IsoBase but are present in STRING protein interaction database is shown in dotted gray lines).

See this image and copyright information in PMC

References

1. Zhang A (2009) Protein Interaction Networks: Computational Analysis. New York, NY, USA: Cambridge University Press, 1st edition.
1. Barabasi AL, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113. - PubMed
1. Cusick ME, Klitgord N, Vidal M, Hill DE (2005) Interactome: gateway into systems biology. Hum Mol Genet 14 Spec No. 2: R171–181. - PubMed
1. Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, et al. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403: 623–627. - PubMed
1. Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, et al. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415: 180–183. - PubMed

Publication types

Actions

MeSH terms

Actions
Actions
Actions
Actions
Actions
Actions
Actions
Actions

LinkOut - more resources

Full Text Sources
Other Literature Sources
- scite Smart Citations

Save citation to file

Email citation

Add to Collections

Add to My Bibliography

Your saved search

Create a file for external citation management software

Your RSS Feed

SMETANA: accurate and scalable algorithm for probabilistic alignment of large-scale biological networks

Affiliation

SMETANA: accurate and scalable algorithm for probabilistic alignment of large-scale biological networks

Authors

Affiliation

Abstract

Conflict of interest statement

Figures

References

Publication types

MeSH terms

LinkOut - more resources

Full Text Sources

Other Literature Sources