Proteins. 2014 Feb;82 Suppl 2(0 2):26-42.
doi: 10.1002/prot.24489.

Challenging the state of the art in protein structure prediction: Highlights of experimental target structures for the 10th Critical Assessment of Techniques for Protein Structure Prediction Experiment CASP10

Andriy Kryshtafovych et al. Proteins. 2014 Feb.

Erratum in

  • Proteins. 2015 Jun;83(6):1198

Abstract

For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas that required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, more than 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult to model. In this article, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets are presented: the acid-gated urea channel, a difficult-to-predict transmembrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin (IL)-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fiber protein gene product 17 from bacteriophage T7; the bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally, an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins.

Keywords: CASP; NMR; X-ray crystallography; critical assessment; model quality; protein structure prediction.

Figures

Figure 1
Figure 1. This composite shows the enlarged membrane-embedded hexameric ring of urea channels next to an electron micrograph of a Helicobacter pylori cell
Urea passes through the center of each of the six channel molecules (two green, two red and two blue molecules). The center of the ring is filled with a lipid bilayer plug. (Credit: Hartmut Luecke / UC Irvine and Andy Freeberg / SLAC National Accelerator Laboratory).
Figure 2
Figure 2. Helical cytokine fold of human IL-34
(A) Comparison of IL-34 (PDB: 4DKC) and “best template” CSF-1 (PDB: 3UF2) chains by SSM superposition, with scant identities boxed in reverse lettering. PSIPRED-defined conserved sequences (with capital letters indicating nearly invariant residues) are adjacent to the X-ray chains. Cys residues are boxed in red, and disulfide links are noted; the CSF-1 Cys involved in an intermolecular disulfide bridge is marked with a red arrow. The N-gly site in IL-34 is boxed in green, and marked by a topmost green pentagon. Exon junctions in the corresponding IL-34 and CSF-1 genes are mapped to the chains with yellow arrows, guiding their alignment. The four core helices are color-ramped blue to red (and labeled A-D), β-strands are named β1 and β2, and extracore helices are noted in grey. (B) Human IL-34 (target R0007) and its best template, CSF-1, are aligned, the chains color-ramped blue to red, and secondary structure labeled as in panel A. Disulfide bridges and the IL-34 glycan chain are highlighted. The compact IL-15 (PDB: 2Z3Q) is the closest short-chain helical cytokine match to IL-34 by SSM search (2.3 Å RMSD). Structures drawn by Pymol (www.pymol.org).
Figure 3
Figure 3. The structure of OrfY from T. tenax
(A) Structure of the monomeric protein in cartoon representation with the two homologous domains colour coded in orange and splitpea green. The secondary structure elements are depicted and labeled β1 - β2 for the strands and α1 - α10 for the helices. The structure is shown in two different orientations related to each other by a rotation of 90 degrees around the Y-axis. The pseudo-twofold symmetry axis is indicated by a dashed arrow and C2. (B) The extended anti-parallel β-sheet motif formed by the β1 and β2 strands is shown, and the H-bond pattern is indicated. (C) Conserved residues forming the presumptive sugar binding site are mapped on the structure in surface representation.
Figure 4
Figure 4. Hard-knock helices in the ORFan domain of mimivirus sulfhydryl oxidase R596
Left, a molecular surface representation of the mimivirus disulfide catalyst R596 dimer is shown with the ORFan domains highlighted as dark blue ribbons. The flavin adenine dinucleotide (FAD) cofactor is in orange sticks, and the sulfur atoms in the FAD-proximal, redox-active disulfide bonds are shown as yellow spheres. Right, the features of the ORFan domain described in the text are labeled on the domain structure. Charged side chains arising from buried positions in ORFan domain helices are shown in stick representation and labeled. A small fragment of the adjacent Erv domain is shown in gray, displaying the glutamate residue that caps the ORFan terminal helix.
Figure 5
Figure 5. Bacteriophage T7 and its fibre protein gp17
A, B. Schematic diagram of bacteriophage T7 in the free state (A) and when bound to its host E. coli (B). The crystallized fragment is shown in a grey box. C. Cartoon representation of the structure of the C-terminal part of the gp17 trimer containing amino acids 371 to 553 (PDB: 4a0u). The N- and C-terminal ends are indicated. Residues that are thought to be important for host attachment and host range determination (518, 520 and 544) are shown as sticks (prepared using the PyMOL Molecular Graphics System, Version 1.4.1 Schrödinger LLC).
Figure 6
Figure 6. Crystal Structure of Bacteriophage CBA-120 tailspike
(A) The overall structure of Tailspike TSP1 homo-trimer. (B) The structure of a TSP1 monomer. In CASP, the structures were assessed in full-length and parsed into four structural domains (D1 to D4). The N-terminal head binding domain includes D1 (residues 12-96, colored blue) and D2 (residues 97-154, colored red). The ligand binding domain includes D3 (residues 198-580, colored green) and D4 (residues 581-796, colored grey and cyan). The ligand binding domain assumes primarily right-handed parallel β-helical structure, but the helical axis is bent by an intervening fragment (residues 581-623) colored grey.
Figure 7
Figure 7. Cartoon representation of 4DMI
This putative coat protein contains a semi-flexible linker between highly structured N- and C-terminal domains (A). Each of the chains is colored differently to highlight the interconnectedness of the N-terminal domain (B).
Figure 8
Solution NMR structure of 2.5D (PDB ID 2M7T) represented as 20 lowest energy conformers superimposed using the THESEUS maximum likelihood method. The engineered integrin-binding loop is rendered in blue and disulfide bonds are shown in orange. This figure was prepared with PyMol (www.pymol.org).
Figure 9
Figure 9. Local accuracy assessment of an engineered loop in target T0711
The per-residue accuracy of predictions was evaluated using all-atom lDDT in multi-reference mode against the NMR ensemble (cut-off radius 10 Å, sequence separation of zero). The engineered loop region (residues 3-13) is shaded. Results by all groups are shown in grey with the best loop predictions highlighted in bold. A: Predictions by BAKER-ROSETTASERVER and PconsM; B: Prediction by BhageerathH; C: Prediction by MULTICOM-REFINE.
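The per-residue scoring described for Figure 9 can be illustrated with a minimal sketch. This is a deliberate simplification, not the assessors' implementation: it uses one coordinate per residue (for example Cα) and a single reference structure, whereas the figure reports all-atom lDDT in multi-reference mode against the NMR ensemble. The function name `per_residue_lddt` and the toy coordinates are hypothetical; the 10 Å radius follows the caption, and the 0.5, 1, 2 and 4 Å tolerances are the standard lDDT thresholds.

```python
import math

# Standard lDDT distance-difference tolerances, in Å.
THRESHOLDS = (0.5, 1.0, 2.0, 4.0)


def per_residue_lddt(ref, model, radius=10.0):
    """Simplified per-residue lDDT (assumption: one point per residue).

    ref, model: equal-length lists of (x, y, z) tuples in the same residue order.
    For each residue i, every reference distance to another residue j that is
    shorter than `radius` is checked: it counts as preserved at a tolerance t
    if the model distance differs from the reference distance by at most t.
    The score is the fraction preserved, pooled over the four tolerances.
    Residues with no neighbours inside the radius get None.
    """
    if len(ref) != len(model):
        raise ValueError("reference and model must have the same length")

    scores = []
    for i in range(len(ref)):
        preserved = 0
        total = 0
        for j in range(len(ref)):
            if i == j:
                continue
            d_ref = math.dist(ref[i], ref[j])
            if d_ref >= radius:
                continue  # only distances inside the inclusion radius count
            diff = abs(d_ref - math.dist(model[i], model[j]))
            for t in THRESHOLDS:
                total += 1
                if diff <= t:
                    preserved += 1
        scores.append(preserved / total if total else None)
    return scores


# Toy example: three residues on a line; the model misplaces the third by 3 Å.
reference = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
print(per_residue_lddt(reference, shifted))
```

Because the score is computed from interatomic distances alone, no superposition of model and reference is needed, which is why a local region such as the engineered loop can be assessed independently of the rest of the structure.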
