Proteins. 2021 Dec;89(12):1977-1986.
doi: 10.1002/prot.26213. Epub 2021 Aug 19.

Continuous Automated Model EvaluatiOn (CAMEO)-Perspectives on the future of fully automated evaluation of structure prediction methods

Xavier Robin et al. Proteins. 2021 Dec.

Abstract

The Continuous Automated Model EvaluatiOn (CAMEO) platform complements the biennial CASP experiment by conducting fully automated blind evaluations of three-dimensional protein prediction servers based on the weekly prerelease of sequences of those structures, which are going to be published in the upcoming release of the Protein Data Bank. While in CASP14, significant success was observed in predicting the structures of individual protein chains with high accuracy, significant challenges remain in correctly predicting the structures of complexes. By implementing fully automated evaluation of predictions for protein-protein complexes, as well as for proteins in complex with ligands, peptides, nucleic acids, or proteins containing noncanonical amino acid residues, CAMEO will assist new developments in those challenging areas of active research.

Keywords: benchmarking; blind assessment; continuous evaluation; ligands; macromolecular complexes; molecular structure prediction; non-canonical residues.

Figures

FIGURE 1
Target 2020‐12‐19_00000231 (PDB ID 7K93) is a hetero‐2‐2‐mer protein complex of a Dengue virus nonstructural protein (NS1) (green) in complex with a mouse neutralizing single‐chain Fab variable region (orange). While templates can be easily identified with HHblits for both entities, there is no overlap between the template lists, meaning the two proteins have never been observed in a homologous complex. Specifically, no homologs of this Dengue virus protein have been observed in complex with an antibody. Hence, this constitutes an interesting target for modeling heteromeric protein complexes.
FIGURE 2
Hypothetical hetero‐2‐2‐mer target (AABB, left) with a ligand, and a hypothetical model of the target (right). (1) The lDDT score assesses the accuracy of each individual chain and measures local and global differences between model and reference structure. When more than one chain is predicted for an entity (B1, B2), only the best‐scoring one (B2) is kept. (2) The oligo‐lDDT score assesses the accuracy of all chains simultaneously while penalizing for missing (A1) or extra chains. (3) The QS‐score assesses the accuracy of the interface(s) between chains. It identifies correct (green dashed line) and inaccurate (orange dashed line) interfaces, and penalizes missing (red dashed line) interfaces. (4) The lDDT‐BS score assesses the accuracy of the binding site of biologically relevant ligands (gray circle, center). (5) Ligand scores assess the accuracy of the ligand (yellow) pose.
FIGURE 3
Target 2020‐05‐09_00000305 (PDB ID 7BRP) is a structure of the SARS‐CoV‐2 main protease in complex with Boceprevir. At the time of prerelease, the structure of the protease had already been solved, and was therefore a trivial modeling target on its own. However, it had not been observed in complex with Boceprevir, and therefore, this complex represents a challenging ligand modeling target.
FIGURE 4
Target 2020‐05‐30_00000276 (PDB ID 6LQF) is an ARID‐PHD protein cassette in complex with a peptide, DNA, and zinc ions. The protein only has remote similarity (<30% sequence identity) to known structures, and none of them are in complex with DNA or the H3K4me3 peptide, making it an extremely challenging target. We are not aware of any methods that would currently be able to model this type of complex with acceptable accuracy. It should be noted that the peptide contains a noncanonical residue (N‐Trimethyllysine, derived from lysine).
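
The per-chain scoring described in the Figure 2 caption, item (1), can be illustrated with a minimal sketch. It is not the CAMEO implementation. It assumes one atom per residue (for example C-alpha), the commonly used lDDT tolerances of 0.5, 1, 2, and 4 Å, and a 15 Å inclusion radius, none of which are stated in the caption. The function names `lddt_like` and `best_chain_per_entity` are hypothetical.

```python
import math
from itertools import combinations

# Assumed parameters (not given in the caption): commonly used lDDT settings.
THRESHOLDS = (0.5, 1.0, 2.0, 4.0)  # tolerated distance differences, in Å
INCLUSION_RADIUS = 15.0            # reference distances at or above this are ignored


def lddt_like(reference, model):
    """Simplified, one-atom-per-residue lDDT-style score in [0, 1].

    reference, model: dicts mapping residue number -> (x, y, z).
    Distances involving residues missing from the model count as not preserved.
    """
    preserved = 0
    total = 0
    for i, j in combinations(sorted(reference), 2):
        d_ref = math.dist(reference[i], reference[j])
        if d_ref >= INCLUSION_RADIUS:
            continue
        total += len(THRESHOLDS)
        if i in model and j in model:
            diff = abs(d_ref - math.dist(model[i], model[j]))
            # One point for every tolerance the distance difference satisfies.
            preserved += sum(diff <= t for t in THRESHOLDS)
    return preserved / total if total else 0.0


def best_chain_per_entity(chain_scores):
    """Keep only the best-scoring predicted chain for each entity.

    chain_scores: iterable of (entity, chain_name, score) tuples.
    Returns a dict mapping entity -> (chain_name, score).
    """
    best = {}
    for entity, chain, score in chain_scores:
        if entity not in best or score > best[entity][1]:
            best[entity] = (chain, score)
    return best
```

For the hypothetical target in Figure 2, calling `best_chain_per_entity` with one score for chain A1 and two scores for chains B1 and B2 would keep B2 for entity B whenever B2 scores higher, which mirrors the selection rule in the caption. The oligo-lDDT, QS-score, lDDT-BS, and ligand scores are not covered by this sketch.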
