BMC Struct Biol. 2004 Jun 18;4:8.
doi: 10.1186/1472-6807-4-8.

Improved protein structure selection using decoy-dependent discriminatory functions

Kai Wang et al. BMC Struct Biol. 2004.

Abstract

Background: A key component in protein structure prediction is a scoring or discriminatory function that can distinguish near-native conformations from misfolded ones. Various types of scoring functions have been developed to accomplish this goal, but their performance is not adequate to solve the structure selection problem. In addition, there is poor correlation between the scores and the accuracy of the generated conformations.

Results: We present a simple and nonparametric formula to estimate the accuracy of predicted conformations (or decoys). This scoring function, called the density score function, evaluates decoy conformations by performing an all-against-all Cα RMSD (root mean square deviation) calculation in a given decoy set. We tested the density score function on 83 decoy sets grouped by their generation methods (4state_reduced, fisa, fisa_casp3, lmds, lattice_ssfit, semfold and Rosetta). The density scores have correlations as high as 0.9 with the Cα RMSDs of the decoy conformations, measured relative to the experimental conformation for each decoy. We previously developed a residue-specific all-atom probability discriminatory function (RAPDF), which compiles statistics from a database of experimentally determined conformations, to aid in structure selection. Here, we present a decoy-dependent discriminatory function called self-RAPDF, where we compiled the atom-atom contact probabilities from all the conformations in a decoy set instead of using an ensemble of native conformations, with a weighting scheme based on the density scores. The self-RAPDF has a higher correlation with Cα RMSD than RAPDF for 76/83 decoy sets, and selects better near-native conformations for 62/83 decoy sets. Self-RAPDF may be useful not only for selecting near-native conformations from decoy sets, but also for fold simulations and protein structure refinement.

Conclusions: Both the density score and the self-RAPDF functions are decoy-dependent scoring functions for improved protein structure selection. Their success indicates that information from the ensemble of decoy conformations can be used to derive statistical probabilities and facilitate the identification of near-native structures.
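
The abstract describes the density score only as the result of an all-against-all Cα RMSD calculation within a decoy set and does not give the formula. As a rough illustration of the idea, the sketch below assumes that a decoy's score is its mean pairwise RMSD to every other decoy in the set, so that decoys in densely populated regions get low scores. That definition, the function names `density_scores` and `rank_by_density`, and the toy matrix are assumptions made for this example, not the paper's published method. The pairwise RMSD matrix is taken as given; computing it would require superimposing each pair of decoys first.

```python
from statistics import mean


def density_scores(pairwise_rmsd):
    """Return one score per decoy: its mean pairwise RMSD to all other decoys.

    ASSUMPTION: "mean RMSD to the other decoys" is a stand-in definition
    chosen for illustration; the abstract does not state the exact formula.
    `pairwise_rmsd` is a square, symmetric matrix (list of lists) in Angstroms.
    """
    n = len(pairwise_rmsd)
    if n < 2:
        raise ValueError("need at least two decoys")
    scores = []
    for i, row in enumerate(pairwise_rmsd):
        if len(row) != n:
            raise ValueError("pairwise RMSD matrix must be square")
        # Skip the diagonal entry (a decoy compared with itself).
        others = [d for j, d in enumerate(row) if j != i]
        scores.append(mean(others))
    return scores


def rank_by_density(pairwise_rmsd):
    """Return decoy indices ordered from lowest (densest) to highest score."""
    scores = density_scores(pairwise_rmsd)
    return sorted(range(len(scores)), key=lambda i: scores[i])


# Toy decoy set (hypothetical values): decoys 0-2 cluster together,
# decoy 3 is an outlier far from the rest.
toy_matrix = [
    [0.0, 1.0, 2.0, 8.0],
    [1.0, 0.0, 1.5, 7.5],
    [2.0, 1.5, 0.0, 9.0],
    [8.0, 7.5, 9.0, 0.0],
]

if __name__ == "__main__":
    print(density_scores(toy_matrix))
    print(rank_by_density(toy_matrix))
```

Under this assumed definition the outlier receives the highest score and the decoy at the center of the cluster the lowest, which is the behavior the abstract relies on when it reports that density scores correlate with Cα RMSD to the experimental conformation.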

Figures

Figure 1
Pairwise RMSD matrix plot for 1ctf in the 4state_reduced, lattice_ssfit, lmds and semfold decoy sets. Each column or row represents one decoy conformation, and each cell represents the pairwise RMSD between the two conformations that correspond to the row and the column. Both columns and rows are ordered by the Cα RMSD between the corresponding decoy conformation and the experimentally determined conformation. The color of the cells reflects the value of the pairwise RMSD between two decoys: the darker the cell, the lower the pairwise RMSD. The dimensions of the four matrices are 630 × 630 (a), 1000 × 1000 (b), 497 × 497 (c) and 1000 × 1000 (d), respectively. Low-RMSD decoy conformations tend to have lower pairwise RMSD with each other.
Figure 2
Histogram of Cα RMSDs relative to the experimentally determined conformation for the 1ctf protein in the 4state_reduced, lattice_ssfit, lmds and semfold sets. There are two peaks for the 4state_reduced set, though only one scoring basin was found in Figure 1. There are three peaks for the lmds set, which happen to represent three scoring basins where decoy conformations tend to accumulate.
Figure 3
Comparison of the performance of RAPDF and self-RAPDF on 83 decoy sets grouped by their generation methods. The average value and standard error of log PB1, log PB10, fraction enrichment (F.E.) and correlation coefficient (C.C.) for each group of sets are shown. In most cases, self-RAPDF performs better than RAPDF.
Figure 4
Histogram of log PB1 (upper panel) and of the correlation coefficient between scores and RMSD relative to the experimentally determined conformation (lower panel), generated by RAPDF, the density score function and self-RAPDF for the 41 decoy sets generated by the Rosetta method. Both the density score function and self-RAPDF perform much better than RAPDF for these sets.
Figure 5
Self-RAPDF score versus Cα RMSD for the 41 most recent Rosetta 10-14-01 decoy sets [30]. For most sets, self-RAPDF scores tend to correlate strongly with the Cα RMSDs between decoys and experimentally determined conformations.
Figure 6
Scatter plot of RAPDF score or self-RAPDF score versus Cα RMSD for six semfold decoy sets. Neither the RAPDF score nor the self-RAPDF score discriminates decoys well on these sets.
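
The figure captions above report a correlation coefficient (C.C.) between a scoring function's output and the Cα RMSD to the experimentally determined conformation. The captions do not say which coefficient is used; the sketch below assumes the ordinary Pearson correlation and uses made-up numbers. The function name `pearson` and the sample values are hypothetical and serve only to show how such a figure could be computed for one decoy set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    ASSUMPTION: the captions only say "correlation coefficient"; Pearson's r
    is used here as the most common reading.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical decoy set: RMSD to the native structure (Angstroms) and
# a score for each decoy, where lower scores are meant to be better.
rmsd_to_native = [1.2, 2.5, 3.1, 4.8, 6.0]
decoy_scores = [-9.5, -8.0, -7.7, -4.1, -3.0]

if __name__ == "__main__":
    print(round(pearson(decoy_scores, rmsd_to_native), 3))
```

A coefficient near 1 means that lower-scoring decoys also lie closer to the experimental structure, which is the property the captions use to compare RAPDF, the density score and self-RAPDF.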

