Application of information theory to feature selection in protein docking
- PMID: 21748327
- DOI: 10.1007/s00894-011-1157-6
Abstract
In the era of structural genomics, the prediction of protein interactions using docking algorithms is an important goal. The success of this method depends critically on identifying good docking solutions among a vast excess of false solutions. We have adapted the concept of mutual information (MI) from information theory to achieve a fast and quantitative screening of different structural features with respect to their ability to discriminate between physiological and nonphysiological protein interfaces. The strategy includes discretizing each structural feature into distinct value ranges so as to optimize its mutual information. We have selected 11 structural features and two datasets to demonstrate that the MI is dimensionless and can be compared directly across diverse structural features and between datasets of different sizes. Conversion of the MI values into a simple scoring function revealed that features with a higher MI are indeed more powerful for identifying good docking solutions. Thus, an MI-based approach allows the rapid screening of structural features with respect to their information content and should therefore be helpful for the design of improved scoring functions in the future. In addition, the concept presented here may also be adapted to related areas that require feature selection for biomolecules or organic ligands.
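To make the idea of discretizing a feature and scoring it by MI concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a binary class label (1 for a physiological interface, 0 for a nonphysiological one) and a single cut point per feature; the function names `discretize`, `mutual_information`, and `best_cut` and the toy values are hypothetical, and the abstract does not specify how the value ranges were actually chosen.

```python
import bisect
import math
from collections import Counter


def discretize(values, edges):
    """Map each value to the index of the value range it falls into.

    `edges` must be sorted ascending; a value equal to an edge goes
    into the upper range.
    """
    return [bisect.bisect_right(edges, v) for v in values]


def mutual_information(bins, labels):
    """MI in bits between two equally long sequences of discrete values."""
    n = len(bins)
    joint = Counter(zip(bins, labels))
    n_bin = Counter(bins)
    n_lab = Counter(labels)
    mi = 0.0
    for (b, c), count in joint.items():
        # p(b, c) * log2( p(b, c) / (p(b) * p(c)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (n_bin[b] * n_lab[c]))
    return mi


def best_cut(values, labels):
    """Assumed search strategy: try every observed value as a single cut point."""
    candidates = sorted(set(values))
    return max(
        candidates,
        key=lambda e: mutual_information(discretize(values, [e]), labels),
    )


# Toy data (hypothetical): one feature value per interface, with its class label.
feature = [0.1, 0.2, 0.8, 0.9]
label = [0, 0, 1, 1]

cut = best_cut(feature, label)
score = mutual_information(discretize(feature, [cut]), label)
```

Because MI is expressed in bits regardless of the feature's units, values computed this way for different features can be ranked against each other. On small datasets, adding more value ranges tends to inflate the estimated MI, so any search over discretizations should be interpreted with that bias in mind.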
Similar articles
- An information-theoretic classification of amino acids for the assessment of interfaces in protein-protein docking. J Mol Model. 2013 Sep;19(9):3901-10. doi: 10.1007/s00894-013-1916-7. Epub 2013 Jul 5. PMID: 23828247
- The scoring bias in reverse docking and the score normalization strategy to improve success rate of target fishing. PLoS One. 2017 Feb 14;12(2):e0171433. doi: 10.1371/journal.pone.0171433. eCollection 2017. PMID: 28196116. Free PMC article.
- Protein-protein docking with binding site patch prediction and network-based terms enhanced combinatorial scoring. Proteins. 2010 Nov 15;78(15):3150-5. doi: 10.1002/prot.22831. PMID: 20806233
- Exploring the potential of global protein-protein docking: an overview and critical assessment of current programs for automatic ab initio docking. Drug Discov Today. 2015 Aug;20(8):969-77. doi: 10.1016/j.drudis.2015.03.007. Epub 2015 Mar 20. PMID: 25801181. Review.
- Predicting 3D structures of protein-protein complexes. Curr Pharm Biotechnol. 2008 Apr;9(2):57-66. doi: 10.2174/138920108783955209. PMID: 18393862. Review.
Cited by
- An information-theoretic classification of amino acids for the assessment of interfaces in protein-protein docking. J Mol Model. 2013 Sep;19(9):3901-10. doi: 10.1007/s00894-013-1916-7. Epub 2013 Jul 5. PMID: 23828247
- Scoring docking conformations using predicted protein interfaces. BMC Bioinformatics. 2014 Jun 6;15:171. doi: 10.1186/1471-2105-15-171. PMID: 24906633. Free PMC article.