Appl Spectrosc. 2024 Sep;78(9):897-911.
doi: 10.1177/00037028241239977. Epub 2024 Apr 22.

Reference Data Set for Circular Dichroism Spectroscopy Comprised of Validated Intrinsically Disordered Protein Models

Gabor Nagy et al. Appl Spectrosc. 2024 Sep.

Abstract

Circular dichroism (CD) spectroscopy is an analytical technique that measures the wavelength-dependent differential absorbance of circularly polarized light and is applicable to most biologically important macromolecules, such as proteins, nucleic acids, and carbohydrates. It serves to characterize the secondary structure composition of proteins, including intrinsically disordered proteins, by analyzing their recorded spectra. Several computational tools have been developed to interpret protein CD spectra. These methods have been calibrated and tested mostly on globular proteins with well-defined structures, mainly due to the lack of reliable reference structures for disordered proteins. It is therefore still largely unclear how accurately these computational methods can determine the secondary structure composition of disordered proteins. Here, we provide such a reference data set, consisting of model structural ensembles and matching CD spectra for eight intrinsically disordered proteins. Using this data set, we have assessed the accuracy of several published CD prediction and secondary structure estimation tools, including our own CD analysis package, SESCA. Our results show that most of the tested methods are generally less accurate for disordered proteins than for globular proteins. In contrast, SESCA, which was developed using globular reference proteins but designed to be applicable to disordered proteins as well, performs similarly well for both classes of proteins. The new reference data set for disordered proteins should allow for further improvement of all published methods.

Keywords: CD; CD prediction; Intrinsically disordered proteins; circular dichroism spectroscopy; protein ensemble refinement; reference data set; secondary structure estimation.

Conflict of interest statement

Declaration of Conflicting Interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1.
IDP8 protein ensemble models. Each ensemble model is an overlay of 20–50 backbone conformations, shown in cartoon representation, and fitted to the first model of the respective ensemble. The name of each ensemble model is displayed above the model. Group A models were previously published and were obtained from the Protein Ensemble Database (PED). Group B models were derived by the authors using NMR chemical shifts, SAXS, and CD measurements. Group C models were derived similarly to the models of Group B but without using CD information.
Figure 2.
Measured IDP8 CD spectra. The spectra of eight different IDP domains are shown in different colors. Abbreviations for the name of each domain are shown in the upper right corner (color-coded) and are listed in Table I. The full name of each IDP domain is listed in the Reference data set assembly section of this paper. Intensities of the CD spectra are expressed in 1000 mean residue ellipticity units (kMRE or 1000 deg·cm²/dmol). The dotted gray line indicates the CD intensity of 0 kMRE.
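As an illustration of the kMRE unit used for the spectra, the following minimal sketch converts a raw ellipticity reading to kMRE with the standard textbook relation [θ] = θ(mdeg) / (10 · c · l · N). The function name and the example values are hypothetical and are not taken from the paper; some conventions use the number of peptide bonds (N − 1) instead of the number of residues N.

```python
def mdeg_to_kmre(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to kMRE (1000 deg*cm^2/dmol).

    theta_mdeg : observed ellipticity in millidegrees
    conc_molar : protein concentration in mol/L
    path_cm    : optical path length of the cuvette in cm
    n_residues : number of residues (assumption: residues, not peptide bonds)
    """
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, path length, and residue count must be positive")
    # Mean residue ellipticity in deg*cm^2/dmol
    mre = theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)
    # Scale to units of 1000 MRE
    return mre / 1000.0


# Hypothetical reading: -20 mdeg, 10 uM protein, 1 mm cuvette, 100 residues
example_kmre = mdeg_to_kmre(-20.0, 1e-5, 0.1, 100)
print(f"{example_kmre:.2f} kMRE")
```

Normalizing by concentration, path length, and chain length in this way is what makes spectra of proteins of different sizes comparable on one intensity axis.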
Figure 3.
Accuracy of CD spectrum predictions. Summary of root mean square deviations (RMSDs) of CD spectra predicted from reference model structures relative to measured spectra of the same protein. Shown are RMSD values averaged over all proteins, for the different methods described in the text. Two reference data sets (RDSs) have been used: IDP8 for disordered proteins (blue) and SP175 for folded globular proteins (orange). Tested CD prediction methods are DichroCalc, PDBMD2CD, and SESCA with four different basis sets (DS-dTSC3, DSSP-1SC3, HBSS-3SC1, and DS5-4SC1).
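The accuracy measure summarized in this figure can be sketched as follows, assuming that each predicted and measured spectrum is sampled at the same wavelengths and that the per-protein RMSDs are averaged with equal weight. The function names and the toy intensities are hypothetical; the wavelength range and any scaling applied in the paper are not stated in this record.

```python
import math


def spectrum_rmsd(predicted, measured):
    """RMSD between two CD spectra given as equal-length intensity lists (kMRE)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("spectra must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def mean_rmsd(spectrum_pairs):
    """Average the per-protein RMSD over (predicted, measured) pairs."""
    values = [spectrum_rmsd(p, m) for p, m in spectrum_pairs]
    return sum(values) / len(values)


# Toy data for two hypothetical proteins (not values from the paper)
pairs = [
    ([-10.0, -5.0, 0.0, 2.0], [-12.0, -7.0, -2.0, 0.0]),  # constant offset of 2
    ([-3.0, -1.0, 1.0, 3.0], [-3.0, -1.0, 1.0, 3.0]),     # perfect prediction
]
print(mean_rmsd(pairs))
```

A lower averaged RMSD indicates that a method reproduces the measured spectra of a reference data set more closely.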
Figure 4.
Accuracy of SS fraction estimates. Summary of averaged RMSDs of SS fractions estimated from the reference CD spectra by different methods relative to SS fractions computed from the respective reference structure. As in Figure 3, two RDSs have been used: IDP8 for disordered proteins (blue), and SP175 for folded globular proteins (orange). The tested SS fraction estimators are K2D3, BESTSEL, and SESCA_Bayes with four different basis sets (DS-dTSC3, DSSP-1SC3, HBSS-3SC1, and DS5-4SC1).
