Bioinformatics. 2010 Aug 1;26(15):1857-63.
doi: 10.1093/bioinformatics/btq295. Epub 2010 Jun 4.

Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function


Huiying Zhao et al. Bioinformatics.

Abstract

Motivation: Template-based prediction of DNA binding proteins requires not only structural similarity between target and template structures but also prediction of binding affinity between the target and DNA to ensure binding. Here, we propose to predict protein-DNA binding affinity by introducing a new volume-fraction correction to a statistical energy function based on a distance-scaled, finite, ideal-gas reference (DFIRE) state.
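The two-part criterion described above (structural similarity to a template plus a favorable predicted binding energy) can be sketched as a simple decision rule. This is only an illustrative sketch: the function name and both cutoff values are hypothetical placeholders, not the thresholds used by the authors, and the abstract does not state the energy units.

```python
def is_predicted_binder(tm_score, binding_energy,
                        tm_cutoff=0.5, energy_cutoff=-10.0):
    """Return True when a target passes both filters.

    tm_score       : TM-align similarity between target and template.
    binding_energy : predicted target-DNA interaction energy
                     (more negative means more favorable).
    tm_cutoff, energy_cutoff : HYPOTHETICAL values chosen for
                     illustration only; the abstract gives no cutoffs.
    """
    similar_enough = tm_score >= tm_cutoff
    binds_favorably = binding_energy <= energy_cutoff
    return similar_enough and binds_favorably


# Example input: the TM-score and energy reported for target 1mjkA
# against template 1ea4A in the caption of Fig. 2(a).
print(is_predicted_binder(0.79, -20.9))
```

Requiring both conditions reflects the point made in the abstract: structural similarity alone does not ensure that the target actually binds DNA.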

Results: We showed that this energy function, together with the structural alignment program TM-align, achieves a Matthews correlation coefficient (MCC) of 0.76 with an accuracy of 98%, a precision of 93% and a sensitivity of 64% for predicting DNA-binding proteins in a benchmark of 179 DNA-binding proteins and 3797 non-binding proteins. This MCC value is substantially higher than the best MCC value of 0.69 given by previous methods. Application of the method to 2235 structural genomics targets identified 37 as DNA-binding proteins: 27 (73%) are putatively DNA binding, only 1 has annotated functions that do not include DNA binding, and the remaining proteins have unknown function. The method provides a highly accurate and sensitive technique for structure-based prediction of DNA-binding proteins.
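As a rough consistency check on the reported metrics, the sketch below computes MCC, accuracy, precision and sensitivity from a confusion matrix. The abstract does not give the confusion-matrix counts; the values used here are an assumption, back-calculated from the stated benchmark sizes (179 binders, 3797 non-binders), the 64% sensitivity and the 93% precision, then rounded to whole proteins.

```python
from math import sqrt


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# ASSUMED counts, inferred from the abstract rather than reported in it:
# 0.64 * 179 is about 115 true positives, and a precision of 0.93
# then implies about 9 false positives.
tp, fn = 115, 64      # 179 DNA-binding proteins in total
fp, tn = 9, 3788      # 3797 non-binding proteins in total

sensitivity = tp / (tp + fn)
precision = tp / (tp + fp)
accuracy = (tp + tn) / (tp + fp + tn + fn)

print(f"sensitivity={sensitivity:.2f} precision={precision:.2f} "
      f"accuracy={accuracy:.2f} MCC={mcc(tp, fp, tn, fn):.2f}")
# prints: sensitivity=0.64 precision=0.93 accuracy=0.98 MCC=0.76
```

With these inferred counts the four values round to the figures quoted in the abstract. The check also shows why accuracy is so high despite the moderate sensitivity: non-binders outnumber binders by roughly 21 to 1, which is the situation MCC is meant to account for.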

Availability: The method is implemented as a part of the Structure-based function-Prediction On-line Tools (SPOT) package available at http://sparks.informatics.iupui.edu/spot


Figures

Fig. 1.
Sensitivity versus false positive rate for DDNA3 (filled black circles) and DDNA2 (open red circles) reveals the importance of an appropriate reference state for method performance in predicting DNA-binding proteins. The results of other methods are adapted from Gao and Skolnick (2008). DDNA3U (open black circles) is the sensitivity versus false positive rate given by DDNA3 on the updated DB250 dataset. TM-score-dependent energy-score thresholds lead to DDNA3O (open diamond) and DDNA3OU (red filled diamond), compared with optimized DBD-Hunter (open green triangle).
Fig. 2.
(a) Structural comparison between APO target protein 1mjkA (green) and template protein 1ea4A (red). The TM-score between them is 0.79 and the interaction energy between 1mjkA and the template DNA is −20.9. (b) Structural comparison between HOLO target protein 1mjmA (green) and template protein 1ea4A. The TM-score between them is 0.76 and the interaction energy between 1mjmA and the template DNA is −20.6.
Fig. 3.
(a) Structural comparison between APO target 1f43A and HOLO target 1le8A. Red: fragment of binding domain of 1f43A. Green: fragment of binding domain of 1le8A. Orange: template DNA of 2bamB. (b) Structural comparison between APO target 1jyfA (red) and HOLO target 1efaA (green). Orange: template DNA of 1rzrA.

References

    1. Ahmad S, et al. Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information. Bioinformatics. 2004;20:477–486.
    2. Angarica VE, et al. Prediction of TF target sites based on atomistic models of protein-DNA complexes. BMC Bioinformatics. 2008;9:436.
    3. Bhardwaj N, et al. Kernel-based machine learning protocol for predicting DNA-binding proteins. Nucleic Acids Res. 2005;33:6486–6493.
    4. Burley SK. An overview of structural genomics. Nat. Struct. Biol. 2000;7:932–934.
    5. Cai Y.-d, Lin SL. Support vector machines for predicting rRNA-, RNA-, and DNA-binding proteins from amino acid sequence. Biochim. Biophys. Acta. 2003;1648:127–133.
