Protein Sci. 2010 Feb;19(2):349-56.
doi: 10.1002/pro.303.

De novo structure generation using chemical shifts for proteins with high-sequence identity but different folds


Yang Shen et al. Protein Sci. 2010 Feb.

Abstract

Proteins with high sequence identity but very different folds present a special challenge to sequence-based protein structure prediction methods. In particular, a 56-residue three-helical bundle protein (GA95) and an α/β-fold protein (GB95), which share 95% sequence identity, were targets in the CASP-8 structure prediction contest. With only 12 out of 300 submitted server-CASP8 models for GA95 exhibiting the correct fold, this protein proved particularly challenging despite its small size. Here, we demonstrate that the information contained in NMR chemical shifts can readily be exploited by the CS-Rosetta structure prediction program and yields adequate convergence, even when input chemical shifts are limited to just amide 1HN and 15N, or 1HN and 1Hα, values.


Figures

Figure 1
Amino acid sequences of GAwt, GBwt, and their variants. The secondary structure of GAwt and GBwt, as identified by DSSP for GA88 (PDB entry 2JWS) and GB88 (2JWU), is indicated at the top and bottom of the figure, respectively. Residues that exhibit high local disorder in the experimental NMR structures (>0.5 Å backbone atom rmsd for the tripeptides centered at this residue) are italicized. Residues that are changed from their wild-type sequences are highlighted in red and cyan for the variants of GAwt and GBwt, respectively. The unique amino acids in the variant pairs GA95 and GB95, and GA88 and GB88, are highlighted in yellow and green, respectively.
Figure 2
Quality of Rosetta/CS-Rosetta fragments used as input for deriving GA and GB models, shown as plots of the lowest (lines with dots) and average (bold lines) backbone coordinate rmsd's (N, Cα, and C′) between any given segment in the experimental structure and 200 nine-residue (upper panel)/three-residue (lower panel) fragments, as a function of starting position of the query segment. Results from the standard Rosetta fragment selection method are plotted in black, whereas those selected using the standard MFR method with chemical shifts are displayed in red. (A) GAwt; (B) GBwt; (C) GA88; (D) GB88; (E) GA95; and (F) GB95. Note that for nine-residue fragments, the last residue starting number in the 56-residue protein is 48, whereas for three-residue fragments, the last starting position is 54.
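The last starting positions quoted in the Figure 2 caption follow from simple window arithmetic: a fragment of length L in a protein of N residues can start at positions 1 through N - L + 1. The short Python sketch below illustrates this; the function name `fragment_start_positions` is hypothetical and is not part of Rosetta or CS-Rosetta.

```python
def fragment_start_positions(n_residues: int, fragment_length: int) -> range:
    """Return the 1-based starting positions of all contiguous fragments.

    Hypothetical helper for illustration only; not a Rosetta function.
    """
    if fragment_length < 1 or fragment_length > n_residues:
        raise ValueError("fragment_length must be between 1 and n_residues")
    # The last fragment must end at residue n_residues, so it starts at
    # n_residues - fragment_length + 1 (range's stop is exclusive).
    return range(1, n_residues - fragment_length + 2)


# 56-residue protein, as stated in the caption.
nine = fragment_start_positions(56, 9)
three = fragment_start_positions(56, 3)

print(nine[-1])   # last start for nine-residue fragments: 48
print(three[-1])  # last start for three-residue fragments: 54
```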
Figure 3
CS-Rosetta structure generation for proteins GAwt, GBwt and variants GA88/95 and GB88/95. (A–F) Plot of Rosetta all-atom energy, rescored by using the input chemical shifts, versus Cα rmsd relative to the experimental structure, for all CS-Rosetta models of proteins GAwt (A), GBwt (B), GA88 (C), GB88 (D), GA95 (E), and GB95 (F). Following the protocol of Shen et al., for all models the residues identified as disordered based on their RCI-derived order parameter (e.g., 1–8 and 52–56 in GA88) are excluded from the calculation of the Cα rmsd and from the Rosetta energy during model selection. Backbone ribbon representation of the lowest-energy CS-Rosetta structure (red) superimposed on the experimental structure (blue) of proteins is shown at the lower right corner of each panel. (A′–F′) Analogous plots of Rosetta all-atom energy, rescored by using the input chemical shifts (δ1Hα and δ1HN only), for the CS-Rosetta models obtained when using only 1H chemical shifts.
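The exclusion of disordered residues from the Cα rmsd described in the Figure 3 caption can be illustrated with a minimal Python sketch. It assumes the model has already been superimposed on the experimental structure (the superposition step is omitted here), and the function names, the dictionary layout, and the toy coordinates are all hypothetical. Only the protein length (56) and the GA88 disordered ranges (1–8 and 52–56) come from the text.

```python
import math


def ordered_residues(n_residues, disordered_ranges):
    """Return 1-based residue numbers that are NOT in any disordered range."""
    excluded = set()
    for first, last in disordered_ranges:
        excluded.update(range(first, last + 1))  # ranges are inclusive
    return [i for i in range(1, n_residues + 1) if i not in excluded]


def ca_rmsd(model, reference, residues):
    """C-alpha rmsd over the selected residues.

    model, reference: dict mapping residue number -> (x, y, z) in angstroms.
    Assumes the two structures are already superimposed.
    """
    if not residues:
        raise ValueError("no residues selected")
    total = 0.0
    for i in residues:
        total += sum((a - b) ** 2 for a, b in zip(model[i], reference[i]))
    return math.sqrt(total / len(residues))


# Disordered ranges quoted for GA88 in the caption.
kept = ordered_residues(56, [(1, 8), (52, 56)])
print(kept[0], kept[-1], len(kept))  # 9 51 43

# Toy coordinates (made up): ordered residues deviate by 1 A,
# disordered termini by 10 A.
reference = {i: (float(i), 0.0, 0.0) for i in range(1, 57)}
model = {
    i: (float(i), 1.0 if i in set(kept) else 10.0, 0.0) for i in range(1, 57)
}
rmsd_ordered = ca_rmsd(model, reference, kept)
rmsd_all = ca_rmsd(model, reference, list(range(1, 57)))
print(rmsd_ordered)  # 1.0
```

In this toy case the rmsd over all 56 residues is dominated by the flexible termini, which is why restricting the calculation to the ordered core gives a more meaningful measure of fold accuracy.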

