Angew Chem Int Ed Engl. 2016 Sep 19;55(39):11970-4.
doi: 10.1002/anie.201604788. Epub 2016 Aug 25.

Prediction of Protein Structure Using Surface Accessibility Data

Christoph Hartlmüller et al. Angew Chem Int Ed Engl. 2016.

Abstract

An approach to the de novo structure prediction of proteins is described that relies on surface accessibility data from NMR paramagnetic relaxation enhancements by a soluble paramagnetic compound (sPRE). This method exploits the distance-to-surface information encoded in the sPRE data in the chemical shift-based CS-Rosetta de novo structure prediction framework to generate reliable structural models. For several proteins, it is demonstrated that surface accessibility data is an excellent measure of the correct protein fold in the early stages of the computational folding algorithm and significantly improves accuracy and convergence of the standard Rosetta structure prediction approach.

Keywords: CS-Rosetta; NMR spectroscopy; paramagnetic relaxation; protein structure prediction; structural biology.

Figures

Figure 1
Principle of sPRE‐CS‐Rosetta. a) NMR sPRE data provide quantitative and residue‐specific information on the solvent accessibility, as the effect of paramagnetic probes such as Gd(DTPA‐BMA) is distance dependent. b) Back‐calculation of sPRE data relies on placing the protein into a grid of equidistantly spaced points, from which points overlapping with the protein are removed. The sPRE is approximated by the sum of all contributions of the surrounding grid points. c) The sPRE module is implemented as a scoring function capable of scoring centroid as well as full‐atom models. At its core, the experimental sPRE data (sPRE_exp) is compared to the predicted sPRE data of the current Rosetta model (sPRE_calc) and a score based on the Spearman correlation coefficient (colored numbers) is computed. In this scheme, the sPRE score is used during the folding of the protein backbone using the simplified centroid model as well as for rescoring the final full‐atom models.
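The back-calculation and scoring steps described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Rosetta module: the function names, the grid spacing, the padding, the exclusion radius, and the r^-6 distance dependence of each grid-point contribution are all assumptions made for the example, since the caption only states that the effect is distance dependent.

```python
import itertools
import math


def build_grid(atoms, spacing=2.0, padding=10.0, exclusion=3.5):
    """Equidistant grid around the protein; points overlapping atoms are dropped.

    spacing, padding and exclusion (in angstroms) are illustrative
    assumptions, not values taken from the paper.
    """
    axes = []
    for i in range(3):
        lo = min(a[i] for a in atoms) - padding
        hi = max(a[i] for a in atoms) + padding
        n = int((hi - lo) / spacing) + 1
        axes.append([lo + k * spacing for k in range(n)])
    return [p for p in itertools.product(*axes)
            if all(math.dist(p, a) >= exclusion for a in atoms)]


def predict_spre(atoms, grid):
    """Per atom, sum an assumed r**-6 contribution from every grid point."""
    return [sum(math.dist(a, g) ** -6 for g in grid) for a in atoms]


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(sxx * syy)
```

Given a list of `(x, y, z)` atom coordinates for a candidate model and the matching list of experimental values, `spearman(spre_exp, predict_spre(atoms, build_grid(atoms)))` yields a correlation between -1 and 1. A rank correlation only requires the predicted and measured values to be ordered alike, so the absolute scale of the predicted sPRE does not matter. How the correlation is converted into a Rosetta score term is not specified in the caption and is left out here.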
Figure 2
sPRE data is an excellent measure of the correct protein fold and improves protein structure prediction. a) Structural ensembles of ubiquitin representing different stages of the AbinitioRelax protocol were rescored using Rosetta centroid and full‐atom scores (orange axis), the sPRE score (blue axis), and the chemical shift score (black axis). Experimental sPRE data for amide (H^N) and aliphatic protons were used as input for the sPRE score. b), c) Box plots showing the average Cα‐RMSD to the native structure for models obtained from CS‐Rosetta (orange) and sPRE‐CS‐Rosetta (blue). sPRE data was determined by NMR experiments (b) or back‐calculated (c). All obtained structural models were scored according to the sum of the Rosetta, chemical shift and sPRE score (b) or according to the sum of the Rosetta and the chemical shift score (c). For every protein, the best‐scored 0.2 % of all models were selected and used to generate the box plots. Proteins for which the sampling was improved by the sPRE module (reduced mean RMSD to native structure compared to CS‐Rosetta) are marked with a gray background, and proteins for which both CS‐Rosetta and sPRE‐CS‐Rosetta failed are not shown (average Cα‐RMSD >10 Å in the case of p16, 1CX1, 1F2H, 1GXE, 1IX5, 1ON4, 1RFL, 1XWE, 2KNR, 2LFC, 2LFP, 2LLL, 2PQE, 2RRF, 3ZQD, and 4A5V). All scores are shown in arbitrary units.
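The selection step in the Figure 2 caption (keep the best-scored 0.2 % of all models, then summarise their Cα-RMSD) can be illustrated as follows. This is a sketch under stated assumptions: the dictionary keys `score` and `rmsd` and both function names are hypothetical, and lower combined scores are taken to be better, as is usual for Rosetta energies.

```python
def select_top_models(models, fraction=0.002):
    """Return the best-scored fraction of models (lowest score first).

    Each model is a dict with hypothetical keys "score" (combined score,
    lower assumed better) and "rmsd" (C-alpha RMSD to the native structure).
    At least one model is always kept.
    """
    n_keep = max(1, round(len(models) * fraction))
    return sorted(models, key=lambda m: m["score"])[:n_keep]


def mean_rmsd(models):
    """Average C-alpha RMSD over the given models."""
    return sum(m["rmsd"] for m in models) / len(models)
```

With 5000 sampled models, for example, `select_top_models` keeps the 10 lowest-scoring ones, and `mean_rmsd` of that subset gives the kind of per-protein value that the box plots compare between the two protocols.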
Figure 3
sPRE data enhances accuracy and convergence of CS‐Rosetta structure prediction. The lowest‐energy models of CS‐Rosetta (orange) and sPRE‐CS‐Rosetta (blue) are compared to the NMR solution structures (gray, PDB code). For both methods, the corresponding Rosetta score (score13_env_hb) is plotted on the left and the distribution of the Cα‐RMSD of the sampled structures is shown below for both methods in a logarithmic histogram. For ubiquitin (a) and the C‐terminal domain of Phl p 5a (b) experimental sPRE data for amide and aliphatic protons is used, and for human prion protein (c) and the P‐type ATPase CopA (d) the input sPRE data was back‐calculated using the lowest energy model. In (a) and (c), the best scored model according to the Rosetta score is shown (see arrow in score plots), and for (b) and (d) the 10 lowest‐energy models are shown. For ubiquitin (a), a red sphere represents the position of the Cβ atom of His 68, indicating the wrong positioning of the β‐strand in the CS‐Rosetta run. A more detailed picture of the scores is shown in the Supporting Information, Figure S3. All scores are shown in arbitrary units.
