Novel nonlinear knowledge-based mean force potentials based on machine learning
- PMID: 20820079
- DOI: 10.1109/TCBB.2010.86
Abstract
The prediction of 3D structures of proteins from amino acid sequences is one of the most challenging problems in molecular biology. An essential task for solving this problem with coarse-grained models is to deduce effective interaction potentials. The development and evaluation of new energy functions are critical to accurately modeling the properties of biological macromolecules. Knowledge-based mean force potentials are derived from statistical analysis of proteins of known structure. Almost all current knowledge-based potentials take the form of a weighted linear sum over interaction pairs. In this study, a class of novel nonlinear knowledge-based mean force potentials is presented. The potential parameters are obtained from nonlinear classifiers, rather than from relative frequencies of interaction pairs against a reference state or from linear classifiers. The support vector machine is used to derive the potential parameters on data sets that contain both native structures and decoy structures. Five existing knowledge-based mean force potentials, each either Boltzmann-based or linear, are introduced and their corresponding nonlinear potentials are implemented. They are the DIH potential (single-body residue-level Boltzmann-based potential), the DFIRE-SCM potential (two-body residue-level Boltzmann-based potential), the FS potential (two-body atom-level Boltzmann-based potential), the HR potential (two-body residue-level linear potential), and the T32S3 potential (two-body atom-level linear potential). Experiments are performed on well-established decoy sets, including the LKF data set, the CASP7 data set, and the Decoys “R” Us data set. The evaluation metrics are the energy Z score and the ability of each potential to discriminate native structures from a set of decoy structures.
Experimental results show that all nonlinear potentials significantly outperform the corresponding Boltzmann-based or linear potentials, and the proposed discriminative framework is effective in developing knowledge-based mean force potentials. The nonlinear potentials can be widely used for ab initio protein structure prediction, model quality assessment, protein docking, and other challenging problems in computational biology.
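The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so this assumes the commonly used definition of the energy Z score (native energy minus the mean decoy energy, divided by the standard deviation of the decoy energies, so that more negative is better) and treats "discrimination" as the rank of the native structure when all structures are sorted by energy. The function names and the example energies are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def energy_z_score(native_energy, decoy_energies):
    """Z score of the native energy relative to the decoy energies.

    Assumed definition: (E_native - mean(E_decoys)) / std(E_decoys).
    A more negative value means the native structure is better separated.
    """
    mu = mean(decoy_energies)
    sigma = pstdev(decoy_energies)  # population standard deviation
    if sigma == 0:
        raise ValueError("decoy energies have zero spread; Z score is undefined")
    return (native_energy - mu) / sigma


def native_rank(native_energy, decoy_energies):
    """1-based rank of the native structure by ascending energy.

    A rank of 1 means no decoy scores lower than the native structure.
    """
    return 1 + sum(1 for e in decoy_energies if e < native_energy)


# Hypothetical energies for one protein and four decoys (arbitrary units).
native = -120.0
decoys = [-100.0, -90.0, -110.0, -80.0]

z = energy_z_score(native, decoys)
rank = native_rank(native, decoys)
print(f"Z score: {z:.3f}, native rank: {rank}")
```

Under these assumptions, a potential is judged better on a decoy set when it ranks the native structure first for more proteins and yields more negative Z scores on average.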