Proteins. 2014 Feb;82 Suppl 2(0 2):138-53.
doi: 10.1002/prot.24340. Epub 2013 Aug 31.

Evaluation of residue-residue contact prediction in CASP10

Bohdan Monastyrskyy et al. Proteins. 2014 Feb.

Abstract

We present the results of the assessment of the intramolecular residue-residue contact predictions from 26 prediction groups participating in the 10th round of the CASP experiment. The most recently developed direct coupling analysis methods did not take part in the experiment, likely because they require a very deep sequence alignment not available for any of the 114 CASP10 targets. The performance of contact prediction methods was evaluated with the measures used in previous CASPs (i.e., prediction accuracy and the difference between the distribution of the predicted contacts and that of all pairs of residues in the target protein), as well as new measures, such as the Matthews correlation coefficient, the area under the precision-recall curve, and the ranks of the first correctly and incorrectly predicted contact. We also evaluated the ability to detect interdomain contacts and tested whether the difficulty of predicting contacts depends upon the protein length and the depth of the family sequence alignment. The analyses were carried out on the target domains for which structural homologs did not exist or were difficult to identify. The evaluation was performed for all types of contacts (short, medium, and long-range), with emphasis placed on long-range contacts, i.e., those involving residues separated by at least 24 residues along the sequence. The assessment suggests that the best CASP10 contact prediction methods perform at approximately the same level, and comparably to those participating in CASP9.
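Two of the measures named in the abstract, precision on the top L/5 long-range contacts and the Matthews correlation coefficient, can be illustrated with a minimal sketch. It is not the assessors' code. The function names, the input format (a list of `(i, j, confidence)` triples with 1-based residue indices) and the representation of true contacts as a set of `(i, j)` pairs with `i < j` are assumptions made for illustration. Only the sequence-separation threshold of 24 comes from the text; restricting the MCC to long-range pairs is also an assumption here.

```python
from math import sqrt


def top_long_range(predictions, length, fraction=5, min_sep=24):
    """Return the top L/fraction long-range pairs, ranked by confidence.

    predictions: iterable of (i, j, confidence); hypothetical input format.
    length: domain length L.
    """
    pairs = [(min(i, j), max(i, j), p)
             for i, j, p in predictions
             if abs(i - j) >= min_sep]
    pairs.sort(key=lambda t: t[2], reverse=True)
    n = max(1, length // fraction)
    return [(i, j) for i, j, _ in pairs[:n]]


def precision(selected, true_contacts):
    """Fraction of selected pairs that are true contacts: TP / (TP + FP)."""
    if not selected:
        return 0.0
    tp = sum(1 for pair in selected if pair in true_contacts)
    return tp / len(selected)


def mcc(selected, true_contacts, length, min_sep=24):
    """Matthews correlation coefficient over all long-range residue pairs."""
    # Number of pairs (i, j) with j - i >= min_sep in a chain of this length.
    span = max(0, length - min_sep)
    total = span * (span + 1) // 2
    chosen = set(selected)
    true_lr = {(i, j) for i, j in true_contacts if j - i >= min_sep}
    tp = len(chosen & true_lr)
    fp = len(chosen) - tp
    fn = len(true_lr) - tp
    tn = total - tp - fp - fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

For a 50-residue domain, `top_long_range` would keep at most 10 long-range pairs; any predicted pair closer than 24 residues in sequence is dropped before ranking.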

Keywords: CASP; RR; residue-residue contact prediction.

Figures

Figure 1
Number of domains per group for which the L/5 list of long-range contacts was evaluated. Two groups, RBO-CON (G334) and FLOUDAS (G077), submitted too few predictions and are not included in the subsequent analyses.
Figure 2
Dendrogram illustrating the similarity among different methods as judged by the number of common predictions for all targets.
Figure 3
Precision (panel A) and cumulative z-score (panel B) for the participating groups on the two sets of the evaluated domains (FM and FM+TBM_hard). The data are shown for the top L/5 long-range contacts. Groups in both panels are ordered according to their cumulative z-score on FM targets.
Figure 4
Precision of the prediction methods as a function of domain length (panel A) and depth of the alignment (panel B). The data are shown for the top L/5 long-range contacts.
Figure 5
PR-curves for all predicted long-range contacts on FM domains.
Figure 6
Percent of cases where the first correct (panel A) and first incorrect (panel B) prediction is in the reported position for each group. Rows are ordered according to the percentage in the first column of panel A. The data are shown for the top L/5 long-range contacts in FM domains.
Figure 7
Example of the prediction of inter-domain contacts for target T0658. This is a two-domain protein, with the first domain (residues 20–185) being an FM target and the second (residues 186–540) a template-based target. The top panel shows L/5 contacts correctly predicted by at least one group as arcs connecting the corresponding residues, which are indicated by circles. We show all the residues involved in correctly predicted contacts in the first (FM) domain, both intra- and inter-domain, and only the residues involved in correctly predicted inter-domain contacts for the second (TBM) domain. The size of the circle is proportional to the number of contacts the residue makes in the experimental structure. Blue and yellow circles are residues belonging to the first and second domain, respectively. The color of the connecting arcs indicates the frequency with which the corresponding contact was predicted by the groups. Red, green, and gray lines indicate contacts predicted with a frequency below the median, between the median and the 3rd quartile, and above the 3rd quartile, respectively. The bottom panel shows the three-dimensional structure of the protein with the first domain in blue and the second in yellow. The correctly predicted contacts are indicated by sticks with the same color scheme as the corresponding arcs in the top panel.
Figure 8
Precision of prediction for the top 10 groups in the latest three CASPs.
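The rank-based measure summarized in Figure 6, the position of the first correct and the first incorrect prediction in a group's ranked list, can be sketched as follows. This is an illustrative assumption about how such ranks might be computed, not the assessors' implementation. The function name and the input format (pairs already sorted by decreasing confidence, true contacts as a set of `(i, j)` tuples) are hypothetical.

```python
def first_ranks(ranked_pairs, true_contacts):
    """Return (rank of first correct, rank of first incorrect) prediction.

    ranked_pairs: list of (i, j) pairs sorted from most to least confident.
    Ranks are 1-based; a value of None means no such prediction was found.
    """
    first_correct = None
    first_incorrect = None
    for rank, pair in enumerate(ranked_pairs, start=1):
        if pair in true_contacts:
            if first_correct is None:
                first_correct = rank
        elif first_incorrect is None:
            first_incorrect = rank
        # Stop early once both ranks are known.
        if first_correct is not None and first_incorrect is not None:
            break
    return first_correct, first_incorrect
```

A method whose first incorrect prediction appears late in the list and whose first correct prediction appears early ranks its contacts more reliably under this measure.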
