Combining evolutionary and structural information for local protein structure prediction
- PMID: 15281130
- DOI: 10.1002/prot.20158
Abstract
We study the effects of various factors in representing and combining evolutionary and structural information for local protein structure prediction based on fragment selection. We prepare databases of fragments from a set of non-redundant protein domains. For each fragment, evolutionary information is derived from homologous sequences and represented as estimated effective counts and frequencies of amino acids (evolutionary frequencies) at each position. Position-specific amino acid preferences, called structural frequencies, are derived from statistical analysis of discrete local structural environments in database structures. Our method for local structure prediction is based on ranking and selecting the database fragments that are most similar to a target fragment. Using secondary structure type as a local structural property, we test our method in a number of settings. The major findings are: (1) The COMPASS-type scoring function for fragment similarity comparison gives better prediction accuracy than three other tested scoring functions for profile-profile comparison. We show that the COMPASS-type scoring function can be derived both in the probabilistic framework and in the framework of statistical potentials. (2) Using the evolutionary frequencies of database fragments gives better prediction accuracy than using structural frequencies. (3) Finer definition of local environments, such as including more side-chain solvent accessibility classes and considering the backbone conformations of neighboring residues, gives increasingly better prediction accuracy using structural frequencies. (4) Combining evolutionary and structural frequencies of database fragments, either in a linear fashion or using a pseudocount mixture formula, improves prediction accuracy. Combination at the log-odds score level is not as effective as combination at the frequency level. This suggests that there might be better ways of combining sequence and structural information than the commonly used linear combination of log-odds scores. Our method of fragment selection and frequency combination gives reasonable secondary structure predictions when tested on 56 CASP5 targets (average SOV score 0.77), suggesting that it is a valid method for local protein structure prediction. A mixture of predicted structural frequencies and evolutionary frequencies improves the quality of local profile-to-profile alignment by COMPASS.
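To make the distinction in finding (4) concrete, the following is a minimal Python sketch of combining two frequency vectors at the frequency level versus at the log-odds score level. It is an illustration under stated assumptions, not the authors' implementation: the function names, the four-letter toy alphabet, the weights, and the exact form of the pseudocount mixture (effective count `n_eff` weighting the evolutionary frequencies against a hypothetical pseudocount weight `beta`) are all assumptions, since the abstract does not give the formulas.

```python
import math


def normalize(freqs):
    """Rescale a dict of non-negative values so that it sums to 1."""
    total = sum(freqs.values())
    return {aa: value / total for aa, value in freqs.items()}


def combine_linear(evo, struct, weight=0.5):
    """Frequency-level linear mixture: weight * evo + (1 - weight) * struct."""
    return normalize(
        {aa: weight * evo[aa] + (1.0 - weight) * struct[aa] for aa in evo}
    )


def combine_pseudocount(evo, struct, n_eff, beta=5.0):
    """Pseudocount-style mixture (assumed form, not taken from the paper).

    Structural frequencies act as pseudocounts with total weight beta,
    balanced against n_eff effective observed counts.
    """
    return normalize(
        {aa: (n_eff * evo[aa] + beta * struct[aa]) / (n_eff + beta) for aa in evo}
    )


def log_odds(freqs, background):
    """Per-residue log-odds score: ln(frequency / background frequency)."""
    return {aa: math.log(freqs[aa] / background[aa]) for aa in freqs}


def combine_log_odds(evo, struct, background, weight=0.5):
    """Score-level combination: weighted sum of the two log-odds vectors."""
    evo_scores = log_odds(evo, background)
    struct_scores = log_odds(struct, background)
    return {
        aa: weight * evo_scores[aa] + (1.0 - weight) * struct_scores[aa]
        for aa in evo
    }


# Toy position over a four-letter alphabet (hypothetical numbers).
background = {"A": 0.25, "G": 0.25, "L": 0.25, "V": 0.25}
evo = {"A": 0.70, "G": 0.10, "L": 0.10, "V": 0.10}
struct = {"A": 0.40, "G": 0.05, "L": 0.30, "V": 0.25}

mixed = combine_linear(evo, struct, weight=0.6)
pseudo = combine_pseudocount(evo, struct, n_eff=3.0, beta=5.0)

# Same inputs and weight, combined at two different levels.
freq_level = log_odds(mixed, background)
score_level = combine_log_odds(evo, struct, background, weight=0.6)
```

The two routes are not equivalent: mixing frequencies first and then taking the log-odds averages probabilities, whereas combining log-odds scores averages logarithms, which corresponds to a weighted geometric mean of the frequencies. The sketch only shows that the two differ; it says nothing about which performs better, which is the empirical question the abstract addresses.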
Copyright 2004 Wiley-Liss, Inc.