The PROSECCO server for chemical shift predictions in ordered and disordered proteins

Máximo Sanz-Hernández¹, Alfonso De Simone²

Affiliations

¹ Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK.
² Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK. adesimon@imperial.ac.uk.

PMID: 29119515
PMCID: PMC5711976
DOI: 10.1007/s10858-017-0145-2

Máximo Sanz-Hernández et al. J Biomol NMR. 2017 Nov;69(3):147-156. Epub 2017 Nov 8.

Abstract

The chemical shifts measured in solution-state and solid-state nuclear magnetic resonance (NMR) are powerful probes of the structure and dynamics of protein molecules. The exploitation of chemical shifts requires methods to correlate these data with protein structures and sequences. We present here an approach to calculate accurate chemical shifts in both ordered and disordered proteins using exclusively the information contained in their sequences. Our sequence-based approach, protein sequences and chemical shift correlations (PROSECCO), achieves the accuracy of the most advanced structure-based methods in the characterization of chemical shifts of folded proteins and improves the state of the art in the study of disordered proteins. Our analyses revealed fundamental insights into the structural information carried by NMR chemical shifts of structured and unstructured protein states.

Keywords: Biomolecular NMR; Chemical shift predictions; Disordered proteins.

Figures

**Fig. 1**
Sequence-based prediction of CS in IDPs. Root mean square deviations (RMSDs) are reported between experimental and predicted CS using PROSECCO_IDP (cyan) and the methods by Tamiola et al. (2010) (red) and Kjaergaard and Poulsen (2011) (yellow). In the case of PROSECCO_IDP, the benchmark was performed using a “leave-one-out” approach: when a BMRB entry is used to calculate the RMSD between experimental and calculated CS, the method is reparameterized with this entry excluded from the parameterizing dataset. The leave-one-out benchmark was repeated in turn for every BMRB entry employed in PROSECCO_IDP. For the two other programs tested, the benchmark was performed on the whole dataset of BMRB entries used in PROSECCO_IDP. A web server for the PROSECCO method is available at http://desimone.bio.ic.ac.uk/prosecco/.
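The leave-one-out protocol described in this caption can be illustrated with a minimal sketch. It is not the PROSECCO_IDP implementation: the "predictor" here is a toy stand-in (the mean shift per residue type over the training entries), and the function names and the data layout (each entry as a list of `(residue, shift)` pairs) are assumptions made only for illustration.

```python
import math
from collections import defaultdict


def fit_mean_table(entries):
    """Toy parameterization (assumption): mean shift per residue type."""
    sums, counts = defaultdict(float), defaultdict(int)
    for entry in entries:
        for residue, shift in entry:
            sums[residue] += shift
            counts[residue] += 1
    return {res: sums[res] / counts[res] for res in sums}


def rmsd(predicted, observed):
    """Root mean square deviation between two equal-length sequences."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def leave_one_out_rmsd(entries):
    """Refit on all entries but one, then score the held-out entry."""
    scores = []
    for i, held_out in enumerate(entries):
        training = entries[:i] + entries[i + 1:]
        table = fit_mean_table(training)
        # Only residues whose type was seen in training can be scored.
        pairs = [(table[res], shift) for res, shift in held_out if res in table]
        if pairs:
            predicted, observed = zip(*pairs)
            scores.append(rmsd(predicted, observed))
    return scores
```

The point of the design is that the held-out entry never contributes to the parameters used to predict it, so each RMSD reflects performance on unseen data.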

**Fig. 2**
Gaussian-kernel prediction of chemical shifts in folded proteins classified in Q3 regions. The benchmark compares the performance of structure-based methods such as SPARTA+ (Shen and Bax 2010) and CamShift (Kohlhoff et al. 2009) with the prediction of CS using Gaussian kernels in indexed Q3 regions (helices, strands and coils) of the protein sequence. The dataset for this benchmark included 77 BMRB entries of structured proteins deposited from 2016 onwards (see Table S4 for the list of BMRB entries and the corresponding PDB codes). (a) Benchmark performed on the whole protein sequences. (b) Benchmark performed after discarding two residues from each terminus of the Q3 regions. A breakdown of the accuracy by Q3 type is reported in Fig. S4.
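The trimming used in panel b can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the per-residue Q3 string encoding (`H`, `E`, `C`) and the function name `core_indices` are assumptions.

```python
from itertools import groupby


def core_indices(q3, trim=2):
    """Return residue indices left after dropping `trim` residues
    from both ends of every contiguous Q3 segment."""
    kept = []
    start = 0
    for _, run in groupby(q3):
        end = start + len(list(run))
        # Segments of length <= 2 * trim yield an empty range.
        kept.extend(range(start + trim, end - trim))
        start = end
    return kept
```

Restricting a benchmark to these indices removes the residues closest to segment boundaries, where the Q3 assignment is least certain.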

**Fig. 3**
Secondary shifts in boundary regions between Q3 segments. The example of the boundary region between α-helices and loops is shown. Bars report the average secondary shifts as a function of the distance from the boundary between the two Q3 segments, with error bars showing the standard deviations. (a) ¹³Cα secondary shifts gradually change from the typical values adopted in α-helices (+3.2 ppm) to those of loop regions (0.0 ppm). (b) Backbone amide ¹⁵N secondary shifts, however, exhibit an anomalous trend. Starting from the typical values of α-helices (−0.95 ppm), the secondary shifts become more negative, reaching −3.0 ppm at the last residue of the α-helix; they then invert toward positive values, with a maximum at position 6 of the loop, and finally fade to 0.0 ppm.

**Fig. 4**
Scheme of the structure-free prediction of PROSECCO_FOLDED. As an example, the local sequence segment “NQNNF” is used to illustrate how the prediction of the chemical shifts of the atoms in the central asparagine is generated. In step 1, the protein sequence is analyzed using PSIPRED (Jones 1999) to predict the secondary structure profile, which provides an estimate of the Q3 regions (helices, strands and coils) of the protein. In steps 2 and 3, the kernel-based prediction is applied to obtain CS tables in each of the Q3 segments, and corrections for the boundary regions are included in step 4. The combination of all the prediction terms generates the output CS values (step 5).

**Fig. 5**
Accuracy of PROSECCO_FOLDED coupled with an artificial neural network. In its final version, PROSECCO_FOLDED was coupled with an artificial neural network to minimize the uncertainty introduced by using predicted secondary structure elements to index the Q3 regions. RMSD values between experimental and calculated CS showed that the sequence-based approach defined in this way reached an accuracy similar to that of predictors relying on structural similarity criteria, such as SPARTA+ (Shen and Bax 2010), and a better performance than methods employing first-principles approaches to analyze protein structures, such as CamShift (Kohlhoff et al. 2009).
