Persistently conserved positions in structurally similar, sequence dissimilar proteins: roles in preserving protein fold and function
- PMID: 11790845
- PMCID: PMC2373454
- DOI: 10.1110/ps.18602
Abstract
Many protein pairs that share the same fold do not have any detectable sequence similarity, providing a valuable source of information for studying the sequence-structure relationship. In this study, we use a stringent data set of structurally similar, sequence-dissimilar protein pairs to characterize residues that may play a role in the determination of protein structure and/or function. For each protein in the database, we identify amino-acid positions that show residue conservation within both close and distant family members. These positions are termed "persistently conserved". We then determine the "mutually" persistently conserved (MPC) positions: those structurally aligned positions in a protein pair that are persistently conserved in both pair mates. Because of their intra- and interfamily conservation, these positions are good candidates for determining protein fold and function. We find that 45% of the persistently conserved positions are mutually conserved. A significant fraction of them are located in positions critical for secondary structure determination, most are buried, and many form spatial clusters within their protein structures. A substitution matrix based on the subset of MPC positions shows two distinct characteristics: (i) it differs from other available matrices, even those derived from structural alignments; and (ii) its relative entropy is high, emphasizing the special residue restrictions imposed on these positions. Such a substitution matrix should be valuable for protein design experiments.
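The two quantities described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy alignment, and the toy frequencies are all hypothetical, and it assumes that persistently conserved positions have already been identified for each protein. The first function keeps the structurally aligned position pairs that are persistently conserved in both pair mates. The second computes the relative entropy of a substitution matrix using the standard definition H = sum over residue pairs of q_ij * log2(q_ij / (p_i * p_j)), where q_ij are observed pair frequencies and p_i are background frequencies; the abstract does not state which formula the authors used.

```python
from math import log2


def mutually_persistently_conserved(aligned_pairs, pc_a, pc_b):
    """Return the aligned (i, j) position pairs for which position i is
    persistently conserved in protein A and position j in protein B.

    aligned_pairs: iterable of (i, j) structurally aligned positions.
    pc_a, pc_b: sets of persistently conserved positions (assumed given).
    """
    return [(i, j) for i, j in aligned_pairs if i in pc_a and j in pc_b]


def relative_entropy(pair_freqs, background):
    """Relative entropy in bits of a substitution matrix, computed from the
    observed pair frequencies q_ij and the background frequencies p_i."""
    return sum(
        q * log2(q / (background[a] * background[b]))
        for (a, b), q in pair_freqs.items()
        if q > 0  # pairs never observed contribute nothing to the sum
    )


# Toy structural alignment between two hypothetical proteins A and B.
aligned_pairs = [(3, 5), (4, 6), (10, 12), (11, 13), (20, 25)]
pc_a = {3, 10, 11, 40}   # hypothetical persistently conserved positions in A
pc_b = {5, 13, 25, 30}   # hypothetical persistently conserved positions in B
mpc = mutually_persistently_conserved(aligned_pairs, pc_a, pc_b)

# Toy two-letter alphabet: identical pairs are over-represented relative to
# the background, so the relative entropy is positive.
background = {"A": 0.5, "B": 0.5}
restricted = {("A", "A"): 0.4, ("B", "B"): 0.4, ("A", "B"): 0.1, ("B", "A"): 0.1}
unrestricted = {pair: 0.25 for pair in restricted}

h_restricted = relative_entropy(restricted, background)
h_unrestricted = relative_entropy(unrestricted, background)
```

In this toy case only the pairs (3, 5) and (11, 13) are conserved on both sides, so they are the MPC positions. The restricted frequencies give a positive relative entropy while the unrestricted ones give zero, which mirrors the abstract's point that a high relative entropy reflects strong residue restrictions.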
Similar articles
- Glimmers in the midnight zone: characterization of aligned identical residues in sequence-dissimilar proteins sharing a common fold. Proc Int Conf Intell Syst Mol Biol. 2000;8:162-70. PMID: 10977077
- The 1.7 A crystal structure of BPI: a study of how two dissimilar amino acid sequences can adopt the same fold. J Mol Biol. 2000 Jun 16;299(4):1019-34. doi: 10.1006/jmbi.2000.3805. PMID: 10843855
- An integrated approach to the analysis and modeling of protein sequences and structures. III. A comparative study of sequence conservation in protein structural families using multiple structural alignments. J Mol Biol. 2000 Aug 18;301(3):691-711. doi: 10.1006/jmbi.2000.3975. PMID: 10966778
- Relationship between sequence conservation and three-dimensional structure in a large family of esterases, lipases, and related proteins. Protein Sci. 1993 Mar;2(3):366-82. doi: 10.1002/pro.5560020309. PMID: 8453375 Free PMC article. Review.
- The family feud: do proteins with similar structures fold via the same pathway? Curr Opin Struct Biol. 2005 Feb;15(1):42-9. doi: 10.1016/j.sbi.2005.01.011. PMID: 15718132 Review.
Cited by
- Atomic interaction networks in the core of protein domains and their native folds. PLoS One. 2010 Feb 23;5(2):e9391. doi: 10.1371/journal.pone.0009391. PMID: 20186337 Free PMC article.
- The crystal structure of the regulatory domain of the human sodium-driven chloride/bicarbonate exchanger. Sci Rep. 2017 Sep 21;7(1):12131. doi: 10.1038/s41598-017-12409-0. PMID: 28935959 Free PMC article.
- Probabilistic divergence of a template-based modelling methodology from the ideal protocol. J Mol Model. 2021 Jan 7;27(2):25. doi: 10.1007/s00894-020-04640-w. PMID: 33411019
- Inconsistent distances in substitution matrices can be avoided by properly handling hydrophobic residues. Evol Bioinform Online. 2008 Oct 9;4:255-61. doi: 10.4137/ebo.s885. PMID: 19204823 Free PMC article.
- Prediction of protein structural features from sequence data based on Shannon entropy and Kolmogorov complexity. PLoS One. 2015 Apr 9;10(4):e0119306. doi: 10.1371/journal.pone.0119306. eCollection 2015. PMID: 25856073 Free PMC article.