Predicting DNA recognition by Cys2His2 zinc finger proteins
- PMID: 19008249
- PMCID: PMC2638941
- DOI: 10.1093/bioinformatics/btn580
Abstract
Motivation: Cys2His2 zinc finger (ZF) proteins represent the largest class of eukaryotic transcription factors. Their modular structure and well-conserved protein-DNA interface allow the development of computational approaches for predicting their DNA-binding preferences even when no binding sites are known for a particular protein. The 'canonical model' for ZF protein-DNA interaction consists of only four amino acid-nucleotide contacts per zinc finger domain.
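To make the canonical model concrete, one finger and its DNA subsite can be represented as four (amino acid, base) contacts. The following is a minimal sketch under stated assumptions: the one-hot layout, the order in which residues are paired with bases, the function name `encode_finger` and the toy input are all illustrative and are not taken from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
BASES = "ACGT"
PAIRS_PER_CONTACT = len(AMINO_ACIDS) * len(BASES)  # 80 possible pairs


def encode_finger(residues, bases):
    """One-hot encode four (amino acid, base) contacts as 4 * 80 = 320 features.

    Assumption: residues[i] is taken to contact bases[i]; the real pairing of
    helix positions with bases is not specified here.
    """
    if len(residues) != 4 or len(bases) != 4:
        raise ValueError("expected four contact residues and four bases")
    vec = [0] * (4 * PAIRS_PER_CONTACT)
    for contact, (aa, base) in enumerate(zip(residues, bases)):
        pair = AMINO_ACIDS.index(aa) * len(BASES) + BASES.index(base)
        vec[contact * PAIRS_PER_CONTACT + pair] = 1
    return vec


# Toy input (hypothetical): four contact residues and a four-base subsite.
features = encode_finger("RDER", "GCGT")
```

Each contact switches on exactly one of its 80 slots, so a vector of this kind could serve as input to a linear or polynomial kernel.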
Results: We present an approach for predicting ZF binding based on support vector machines (SVMs). While most previous computational approaches have been based solely on examples of known ZF protein-DNA interactions, ours additionally incorporates information about protein-DNA pairs known to bind weakly or not at all. Moreover, SVMs with a linear kernel can naturally incorporate constraints about the relative binding affinities of protein-DNA pairs; this type of information has not been used previously in predicting ZF protein-DNA binding. Here, we build a high-quality literature-derived experimental database of ZF-DNA binding examples and utilize it to test both linear and polynomial kernels for predicting ZF protein-DNA binding on the basis of the canonical binding model. The polynomial SVM outperforms previously published prediction procedures as well as the linear SVM. This may indicate the presence of dependencies between contacts in the canonical binding model and suggests that modification of the underlying structural model may result in further improved performance in predicting ZF protein-DNA binding. Overall, this work demonstrates that methods incorporating information about non-binding and relative binding of protein-DNA pairs have great potential for effective prediction of protein-DNA interactions.
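The reason a linear kernel accommodates relative binding information can be sketched briefly. With a linear score f(x) = w·x + b, the statement "pair A binds more strongly than pair B" becomes w·(x_A - x_B) > 0, which is one more linear constraint on w (the bias cancels). The example below is a hedged illustration, not the authors' implementation: the two-dimensional feature vectors are hypothetical toy data, the helper names are invented, and a simple margin perceptron stands in for a real SVM solver, so it finds a solution that satisfies the constraints rather than a maximum-margin one.

```python
def build_constraints(binders, non_binders, stronger_than):
    """Turn all three kinds of evidence into (vector, label) pairs.

    A trailing 1.0 carries the bias term; it is 0.0 for difference vectors
    because the bias cancels when two scores are subtracted.
    """
    examples = [(x + [1.0], 1) for x in binders]
    examples += [(x + [1.0], -1) for x in non_binders]
    for strong, weak in stronger_than:
        diff = [s - w for s, w in zip(strong, weak)]
        examples.append((diff + [0.0], 1))
    return examples


def train(examples, dim, lr=0.1, max_epochs=1000):
    """Margin perceptron: step on any constraint whose margin is below 1."""
    w = [0.0] * dim
    for _ in range(max_epochs):
        updated = False
        for x, y in examples:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if margin < 1.0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                updated = True
        if not updated:  # every constraint holds with margin >= 1
            break
    return w


def score(w, x):
    """Predicted binding score w.x + b for a raw feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x + [1.0]))


# Hypothetical toy data: two binders, two non-binders, one relative constraint.
binders = [[1.0, 0.0], [1.0, 1.0]]
non_binders = [[0.0, 0.0], [0.0, 1.0]]
stronger_than = [([1.0, 1.0], [1.0, 0.0])]  # first binds more strongly

w = train(build_constraints(binders, non_binders, stronger_than), dim=3)
```

Binding, non-binding and relative-affinity examples all end up in one constraint list and are handled by the same update rule. A polynomial kernel does not admit this simple difference-vector reduction, which is consistent with the abstract tying the relative constraints to the linear kernel.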
Availability: An online tool for predicting ZF DNA binding is available at http://compbio.cs.princeton.edu/zf/.