Proteins. 2021 Dec;89(12):1987-1996.
doi: 10.1002/prot.26231. Epub 2021 Oct 5.

Modeling SARS-CoV-2 proteins in the CASP-commons experiment

Andriy Kryshtafovych et al. Proteins. 2021 Dec.

Abstract

Critical Assessment of Structure Prediction (CASP) is an organization aimed at advancing the state of the art in computing protein structure from sequence. In the spring of 2020, CASP launched a community project to compute the structures of the most structurally challenging proteins coded for in the SARS-CoV-2 genome. Forty-seven research groups submitted over 3000 three-dimensional models and 700 sets of accuracy estimates on 10 proteins. The resulting models were released to the public. CASP community members also worked together to provide estimates of local and global accuracy and identify structure-based domain boundaries for some proteins. Subsequently, two of these structures (ORF3a and ORF8) have been solved experimentally, allowing assessment of both model quality and the accuracy estimates. Models from the AlphaFold2 group were found to have good agreement with the experimental structures, with main chain GDT_TS accuracy scores ranging from 63 (a correct topology) to 87 (competitive with experiment).
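The GDT_TS values quoted above are on a 0 to 100 scale. As a rough illustration of how such a score is built, the sketch below averages the fractions of residues whose model-to-experiment Cα distance falls within 1, 2, 4, and 8 Å. It is a simplification under stated assumptions: the function name `gdt_ts` and its input format are hypothetical, and the distances are assumed to come from a single fixed superposition, whereas the standard score searches over many superpositions and keeps the best fraction at each cutoff.

```python
from typing import Sequence


def gdt_ts(distances: Sequence[float]) -> float:
    """Simplified GDT_TS from per-residue C-alpha distances in angstroms.

    Assumption: `distances` were measured under one fixed superposition of
    the model onto the experimental structure. The standard score instead
    maximizes the residue count separately for each cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    # Average the four fractions and rescale to 0..100.
    return 100.0 * sum(fractions) / len(cutoffs)
```

Under this simplification a model with every residue within 1 Å scores 100, and a model with no residue within 8 Å scores 0.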

Keywords: CASP; COVID; EMA; SARS-CoV-2; model accuracy; protein structure prediction.

Figures

FIGURE 1
Screenshot of the model consensus table (https://predictioncenter.org/caspcommons/models_consensus2.cgi) for the SARS‐CoV‐2 M‐protein (target C1906), showing local structural agreement along the sequence between the selected model (second column) and the remaining models. The black box marks the region where many models agree, suggesting a relatively easy‐to‐model domain.
FIGURE 2
Maximum consensus scores on CASP‐COVID targets (EMA‐jury: gray bars; overall consensus: black bars). Targets are ordered by increasing EMA‐jury values. The gray bars are always longer than the black ones, indicating that the EMA‐jury method successfully selects subsets of models that are more structurally consistent. The vertical dashed line corresponds to the consensus level of 0.6, which represents the 100th percentile of overall consensus scores for all models (Figure SFQA4). CASP, Critical Assessment of Structure Prediction; CASP‐COVID, CASP community‐wide experiment on modeling SARS‐CoV‐2 proteins causing the coronavirus disease; EMA, estimates of model accuracy.
FIGURE 3
Selection of the top model by the estimates of model accuracy (EMA)‐jury (top panel) and by simple structural consensus (bottom panel) on 80 CASP13 targets. Maximum per‐target CAD‐scores are shown as upward‐pointing triangles; the CAD‐scores of models selected by the EMA‐jury approach (top) and the simple structural consensus method (bottom) are shown as downward‐pointing triangles. The hardest‐to‐predict targets (FM) are in red, the others in green. Vertical lines between corresponding triangles represent the error in the selection process. Comparing the top and bottom panels shows that the EMA‐jury method selects models close to the best available value more often than simple consensus does.
FIGURE 4
Round 1 three‐dimensional (3D) and accuracy estimation results for SARS‐CoV‐2 ORF3a (C1905). (A) Each green cross represents a 3D model, black squares indicate models selected as high accuracy by accuracy estimation methods, and orange circles indicate models selected by the estimates of model accuracy (EMA)‐Jury method. 3D model accuracy is shown in terms of LDDT (y‐axis) and GDT_TS (x‐axis). Only one accuracy estimation method selected a higher‐accuracy model. (B) Locally inaccurate regions of the highest‐scoring model, AF‐COV_2, according to the ULR definition (left) and as predicted for the same model by the BAKER EMA method (right). The superpositions are identical; the crystal structure is in yellow, ULRs and predicted inaccurate regions are in red, and the rest of the model is in green.
FIGURE 5
Round 2 3D and accuracy estimation results for two domains of the SARS‐CoV‐2 ORF3a protein: (A) C1905‐D1 and (B) C1905‐D2. Each green cross is a 3D model, with accuracy shown in terms of LDDT (y‐axis) and GDT_TS (x‐axis). The panels show both the models from CASP‐COVID and the AF‐COV models added in the post‐CASP EMA experiment (pink stars). The models selected by EMA methods as top 1 during CASP‐COVID are shown as black hollow squares; models selected in the post‐CASP experiment are shown as pink hollow squares. For Domain 1, three out of four EMA groups selected one of the higher‐accuracy AlphaFold models, although many low‐accuracy models were also selected. The pattern is similar for Domain 2, where two of four methods picked two different AlphaFold models.
FIGURE 6
Round 1 3D modeling and accuracy estimation (EMA) results for SARS‐CoV‐2 protein ORF8 (C1908). 3D model accuracy for submissions is shown in terms of LDDT (y‐axis) and GDT_TS (x‐axis) (green crosses), together with EMA selections (black squares for CASP‐COVID, pink squares for the post‐CASP experiment, orange circles for EMA‐Jury). Five AF2 models added in the post‐CASP experiment are shown as pink stars. Two of the AF2 models are impressively accurate. Two post‐CASP EMA methods succeeded in selecting those models as best.
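
The captions for Figures 2 and 3 refer to selecting a model by structural consensus and to the error of that selection. A minimal sketch of the simple consensus idea follows. It assumes a precomputed table of pairwise model similarities on a 0 to 1 scale; the function names and the table layout are hypothetical, and this is not the EMA‐jury procedure itself, which is not described in the text above.

```python
from typing import Dict

# Hypothetical layout: similarity[a][b] is the structural similarity (0..1)
# between models a and b, computed by some pairwise measure chosen by the user.
Similarity = Dict[str, Dict[str, float]]


def consensus_scores(similarity: Similarity) -> Dict[str, float]:
    """Score each model by its mean similarity to all other models."""
    scores = {}
    for model, row in similarity.items():
        others = [s for other, s in row.items() if other != model]
        scores[model] = sum(others) / len(others) if others else 0.0
    return scores


def select_by_consensus(similarity: Similarity) -> str:
    """Return the model that agrees best, on average, with the rest."""
    scores = consensus_scores(similarity)
    return max(scores, key=scores.get)


def selection_error(true_scores: Dict[str, float], selected: str) -> float:
    """Gap between the best available model and the selected one.

    This corresponds to the vertical lines between triangles in Figure 3.
    """
    return max(true_scores.values()) - true_scores[selected]
```

A selection error of zero means the consensus pick was also the most accurate model for that target; larger values mean a better model was available but not chosen.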
