Using sequence similarity networks for visualization of relationships across diverse protein superfamilies
- PMID: 19190775
- PMCID: PMC2631154
- DOI: 10.1371/journal.pone.0004345
Abstract
The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
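The idea of a sequence similarity network with overlaid annotation can be illustrated with a minimal sketch. It assumes that nodes are sequences, that an edge joins two sequences whose pairwise score (here an E-value-like number, lower meaning more similar) passes a chosen threshold, and that clusters are read off as connected components. The sequence names, scores, threshold, functional labels, and function names below are all hypothetical toy values, not data or code from the paper.

```python
from collections import Counter, defaultdict


def build_network(pairwise_scores, threshold):
    """Build adjacency sets; an edge is kept when score <= threshold."""
    adjacency = defaultdict(set)
    for (a, b), score in pairwise_scores.items():
        # Touch both keys so isolated sequences still appear as nodes.
        adjacency[a]
        adjacency[b]
        if score <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def connected_components(adjacency):
    """Return clusters (sorted lists of node names) via depth-first search."""
    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adjacency[node] - members)
        seen |= members
        clusters.append(sorted(members))
    return clusters


def overlay_labels(clusters, labels):
    """Count annotation labels per cluster (the 'orthogonal information')."""
    return [
        dict(Counter(labels.get(node, "unannotated") for node in cluster))
        for cluster in clusters
    ]


# Hypothetical pairwise scores between six toy sequences.
scores = {
    ("A", "B"): 1e-40, ("A", "C"): 1e-35, ("B", "C"): 1e-30,
    ("D", "E"): 1e-25, ("A", "D"): 1e-3, ("C", "E"): 0.5,
    ("F", "A"): 1.0,
}
# Hypothetical functional annotations to overlay on the network.
labels = {
    "A": "hydratase", "B": "hydratase", "C": "isomerase",
    "D": "dehalogenase", "E": "dehalogenase",
}

network = build_network(scores, threshold=1e-10)
clusters = connected_components(network)
label_counts = overlay_labels(clusters, labels)

for cluster, counts in zip(clusters, label_counts):
    print(cluster, counts)
```

In this toy setup, a cluster whose label counts are mixed, or a sequence that ends up in a cluster of its own, is the kind of outlier that overlaying annotation is meant to make visible. Loosening or tightening the threshold changes which clusters merge or split, so any reading of the clusters depends on that choice.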