Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome
- PMID: 24792350
- PMCID: PMC4008587
- DOI: 10.1371/journal.pone.0096694
Abstract
As increasingly inexpensive sequencing techniques uncover more and more protein sequences, determining their functions has become an urgent task. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than the low-resolution two-state (binding or non-binding) prediction that most existing techniques provide. The method first predicts the protein-DNA complex structure using the template-based structure prediction technique HHblits, and then predicts binding affinity with a knowledge-based energy function (distance-scaled finite ideal-gas reference state for protein-DNA interactions). A leave-one-out cross-validation of the method on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77, with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins, based on the predicted DNA-binding complex structures. Its application to the human proteome yields more than 300 novel DNA-binding proteins; some of these predicted structures were validated against known structures of homologous proteins in apo forms. The method [SPOT-Seq (DNA)] is available as an online server at http://sparks-lab.org.
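The cross-validation figures quoted in the abstract (MCC, precision, sensitivity) are all functions of one binary confusion matrix. The sketch below shows how they relate. The abstract does not report raw counts, so the values of `tp`, `fn`, `fp`, and `tn` are assumptions back-calculated from the rounded percentages and the stated set sizes (179 binding, 3797 non-binding domains); the function name `confusion_metrics` is likewise illustrative and not part of SPOT-Seq.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (precision, sensitivity, mcc) for a binary confusion matrix."""
    precision = tp / (tp + fp)        # fraction of predicted binders that are real
    sensitivity = tp / (tp + fn)      # fraction of real binders that are recovered
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, sensitivity, mcc


# Assumed counts, NOT taken from the paper: chosen so that the totals match
# the 179 binding and 3797 non-binding domains named in the abstract.
tp, fn = 116, 63    # 116 + 63 = 179 DNA-binding domains
fp, tn = 7, 3790    # 7 + 3790 = 3797 non-binding domains

precision, sensitivity, mcc = confusion_metrics(tp, fp, tn, fn)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f} mcc={mcc:.2f}")
```

With these assumed counts the three values round to the reported 94%, 65%, and 0.77, which suggests the abstract's figures are mutually consistent. MCC is the informative summary here because the classes are heavily imbalanced: non-binding domains outnumber binding ones by roughly 21 to 1, so plain accuracy would be dominated by true negatives.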
Similar articles
- Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function. Bioinformatics. 2010 Aug 1;26(15):1857-63. doi: 10.1093/bioinformatics/btq295. Epub 2010 Jun 4. PMID: 20525822 Free PMC article.
- Prediction and validation of the unexplored RNA-binding protein atlas of the human proteome. Proteins. 2014 Apr;82(4):640-7. doi: 10.1002/prot.24441. Epub 2013 Nov 22. PMID: 24123256 Free PMC article.
- Carbohydrate-binding protein identification by coupling structural similarity searching with binding affinity prediction. J Comput Chem. 2014 Nov 15;35(30):2177-83. doi: 10.1002/jcc.23730. Epub 2014 Sep 15. PMID: 25220682
- Predicting Protein-Protein Interactions from the Molecular to the Proteome Level. Chem Rev. 2016 Apr 27;116(8):4884-909. doi: 10.1021/acs.chemrev.5b00683. Epub 2016 Apr 13. PMID: 27074302 Review.
- Computational prediction of DNA-protein interactions: a review. Curr Comput Aided Drug Des. 2010 Sep;6(3):197-206. doi: 10.2174/157340910791760091. PMID: 20438443 Review.
Cited by
- SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues. PLoS One. 2015 Jul 15;10(7):e0133260. doi: 10.1371/journal.pone.0133260. eCollection 2015. PMID: 26176857 Free PMC article.
- A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond. Brief Bioinform. 2024 Mar 27;25(3):bbae162. doi: 10.1093/bib/bbae162. PMID: 38739759 Free PMC article. Review.
- Prediction of RNA- and DNA-Binding Proteins Using Various Machine Learning Classifiers. Avicenna J Med Biotechnol. 2019 Jan-Mar;11(1):104-111. PMID: 30800250 Free PMC article.
- Deep-WET: a deep learning-based approach for predicting DNA-binding proteins using word embedding techniques with weighted features. Sci Rep. 2024 Feb 5;14(1):2961. doi: 10.1038/s41598-024-52653-9. PMID: 38316843 Free PMC article.
- HOMCOS: an updated server to search and model complex 3D structures. J Struct Funct Genomics. 2016 Dec;17(4):83-99. doi: 10.1007/s10969-016-9208-y. Epub 2016 Aug 13. PMID: 27522608 Free PMC article.