Protein space: a natural method for realizing the nature of protein universe
- PMID: 23154188
- DOI: 10.1016/j.jtbi.2012.11.005
Abstract
Current methods cannot tell us concretely what the nature of the protein universe is. They are based on different models of amino acid substitution and on multiple sequence alignment, which is an NP-hard problem and requires manual intervention. Protein structural analysis offers another direction for mapping the protein universe. Unfortunately, only a minuscule fraction of proteins' 3-dimensional structures are currently known. Furthermore, the phylogenetic tree representations produced by existing tree construction methods are not unique. Here we develop a novel method to realize the nature of the protein universe. We show that the protein universe can be realized as a protein space in 60-dimensional Euclidean space using a distance based on a normalized distribution of amino acids. Every protein is in one-to-one correspondence with a point in protein space, where proteins with similar properties stay close together. Thus the distance between two points in protein space represents the biological distance between the corresponding two proteins. We also propose a natural graphical representation for inferring phylogenies. The representation is natural and unique because it is based on the biological distances between proteins in protein space. This will solve the fundamental question of how proteins are distributed in the protein universe.
Copyright © 2012 Elsevier Ltd. All rights reserved.
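The abstract does not spell out how a sequence becomes a point in 60-dimensional space. The sketch below is one plausible reading, not the authors' confirmed construction: it assumes three numbers per standard amino acid (its count, the mean of its 1-based positions, and a second central moment of those positions normalized by count and sequence length), giving 20 × 3 = 60 coordinates, and uses the ordinary Euclidean distance between the resulting vectors. The names `AMINO_ACIDS`, `protein_vector`, and `protein_distance` are hypothetical and chosen only for illustration.

```python
import math

# Assumption: the 20 standard amino acids, in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def protein_vector(seq):
    """Map a protein sequence to a 60-dimensional vector (illustrative only).

    Layout: 20 counts, then 20 mean positions, then 20 normalized
    second central moments. Amino acids absent from the sequence
    contribute zeros in all three slots.
    """
    n = len(seq)
    counts, means, moments = [], [], []
    for aa in AMINO_ACIDS:
        # 1-based positions at which this amino acid occurs.
        positions = [i + 1 for i, c in enumerate(seq) if c == aa]
        n_k = len(positions)
        if n_k == 0:
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(positions) / n_k
        # Second central moment, normalized by count and sequence length
        # (the normalization is an assumption of this sketch).
        d2 = sum((p - mu) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def protein_distance(seq_a, seq_b):
    """Euclidean distance between the vectors of two sequences."""
    va, vb = protein_vector(seq_a), protein_vector(seq_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))
```

Under these assumptions, calling `protein_distance(seq_a, seq_b)` on two sequences needs no alignment step. Sequences of different lengths still map to vectors of the same dimension, which is what makes a single fixed-dimensional space possible.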