J Phys Chem B. 2018 Apr 12;122(14):3920-3930.

doi: 10.1021/acs.jpcb.8b01763. Epub 2018 Mar 29.

Accurately Predicting Disordered Regions of Proteins Using Rosetta ResidueDisorder Application

Stephanie S Kim¹, Justin T Seffernick¹, Steffen Lindert¹

PMID: 29595057
PMCID: PMC5897131
DOI: 10.1021/acs.jpcb.8b01763

Stephanie S Kim et al. J Phys Chem B. 2018.

Affiliation

¹ Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States.

Abstract

Although many proteins require well-folded structures to carry out their biological functions, a large fraction of functioning proteins contain regions, known as intrinsically disordered protein regions, where stable structures are not likely to form. Notable functional roles of intrinsically disordered proteins are in transcriptional regulation, translation, and cellular signal transduction. Moreover, intrinsically disordered protein regions are highly abundant in many proteins associated with various human diseases; these segments have therefore become attractive drug targets for potential therapeutics. Over the past decades, numerous computational methods have been developed to accurately predict disordered regions of proteins. Here we introduce a user-friendly and reliable approach for the prediction of disordered protein regions using the structure prediction software Rosetta. Using 245 proteins from a benchmark data set (16 DisProt database proteins) and a test data set (229 proteins with NMR data), we use Rosetta to predict the global protein structures and then show that there is a statistically significant difference between Rosetta scores in disordered and ordered regions, with scores being less favorable in disordered regions. Furthermore, the difference in scores between ordered and disordered protein regions is sufficient to accurately identify disordered protein regions. As a result, our Rosetta ResidueDisorder method (benchmark data set prediction accuracy of 71.77% and independent test data set prediction accuracy of 65.37%) outperformed other established disorder prediction tools and did not exhibit a biased prediction toward either ordered or disordered regions. To facilitate usage, a Rosetta application has been developed for the Rosetta ResidueDisorder method.

Figures

**Figure 1. Correlation between degree of disorder and size-normalized per-residue Rosetta score**
A positive correlation is observed between the degree of disorder (the fraction of disordered residues in each protein) of the 16 benchmark dataset proteins and the Rosetta score per residue of each protein. As the fraction of disordered residues increases, the size-normalized per-residue Rosetta score also increases (i.e., becomes less favorable).
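As a rough illustration of the relationship described in this caption, the sketch below size-normalizes a whole-protein score by sequence length and computes a Pearson correlation against the fraction of disordered residues. The function names and the choice of Pearson's r are assumptions made for illustration; the caption does not state which correlation statistic the authors used.

```python
from math import sqrt


def score_per_residue(total_score, n_residues):
    """Size-normalize a whole-protein score by dividing by sequence length."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return total_score / n_residues


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

Here `xs` would hold the fraction of disordered residues per protein and `ys` the corresponding size-normalized scores; a value of r near +1 would correspond to the positive trend the caption describes.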

**Figure 2. Comparison of individual Rosetta order scores of disordered and ordered residues**
This figure compares the distributions of the order score (defined as the window-averaged per-residue Rosetta score) for all disordered and ordered residues in the 16-protein benchmark dataset. The two extreme points (maximum and minimum), the mean, and the median are shown as tick marks.
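The order score defined in this caption is a window average of per-residue scores. A minimal sketch of such an average is given below, assuming a centered window that is truncated at the chain termini; the truncation rule and the function name `order_scores` are assumptions, not details taken from the record. The default window of 11 residues is the value quoted in the Figure 3 caption.

```python
def order_scores(per_residue_scores, window=11):
    """Centered moving average of per-residue scores.

    The window is truncated at the chain termini (an assumption), so
    terminal residues are averaged over fewer neighbors.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(per_residue_scores)
    averaged = []
    for i in range(n):
        start = max(0, i - half)
        stop = min(n, i + half + 1)
        segment = per_residue_scores[start:stop]
        averaged.append(sum(segment) / len(segment))
    return averaged
```

For example, `order_scores([0.0, 3.0, 6.0], window=3)` averages two values at each end and three in the middle.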

**Figure 3. Optimization of terminal residues prediction of the benchmark dataset**
Rosetta ResidueDisorder disorder predictions for all 16 benchmark dataset proteins are shown. The blue data points are Rosetta order scores calculated with a window size of 11 residues. Residues with order scores above the cutoff line (red line) are predicted as disordered, while residues with order scores below the cutoff line are predicted as ordered. The cutoff line is sloped for terminal residues in proteins for which fewer than 60% of residues are predicted as disordered using a flat cutoff line at -1.0 REU. The cutoff values are increased for the terminal 13% of the protein sequence, up to a maximum cutoff of -0.3 REU.
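The cutoff rule in this caption can be sketched roughly as follows. The numeric values (-1.0 REU flat cutoff, -0.3 REU maximum, terminal 13%, 60% threshold) come from the caption; the linear shape of the slope, its application to both termini, and all function and parameter names are assumptions made for illustration and may differ from the actual Rosetta application.

```python
def terminal_cutoffs(n, flat=-1.0, max_cutoff=-0.3, terminal_frac=0.13):
    """Per-residue cutoffs: flat in the interior, raised toward each terminus.

    Assumption: the cutoff rises linearly from `flat` at the inner edge of
    the terminal region to `max_cutoff` at the terminal residue itself.
    """
    n_term = int(terminal_frac * n)
    cutoffs = [flat] * n
    for k in range(n_term):
        weight = 1.0 - k / n_term  # 1.0 at the terminal residue
        value = flat + (max_cutoff - flat) * weight
        cutoffs[k] = max(cutoffs[k], value)
        cutoffs[n - 1 - k] = max(cutoffs[n - 1 - k], value)
    return cutoffs


def predict_disorder(scores, flat=-1.0, max_cutoff=-0.3,
                     terminal_frac=0.13, switch_frac=0.60):
    """Return True (disordered) or False (ordered) for each order score."""
    n = len(scores)
    if n == 0:
        return []
    # First pass: flat cutoff; scores above the cutoff count as disordered.
    flat_calls = [s > flat for s in scores]
    if sum(flat_calls) / n >= switch_frac:
        return flat_calls
    # Fewer than 60% predicted disordered: use sloped terminal cutoffs.
    cutoffs = terminal_cutoffs(n, flat, max_cutoff, terminal_frac)
    return [s > c for s, c in zip(scores, cutoffs)]
```

Raising the cutoff near the termini makes it harder for terminal residues to be called disordered, which matters only for proteins that the flat cutoff classifies as mostly ordered.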

**Figure 4. Comparison of 6 prediction tools’ accuracy on 16 benchmark set proteins**
The bar graph compares the average percent accuracy across 5 IDP categories (0% disordered proteins (blue); 30% disordered proteins (yellow); 50% disordered proteins (green); 70% disordered proteins (red); 100% disordered proteins (purple)) for each prediction tool. The error bars represent the standard deviation for each IDP category. The IDP 50% bars do not have error bars because that category contained only one protein. PONDR VL3-H and Meta-Disorder show prediction accuracy biased toward long disordered regions, whereas PrDOS, IUPred, DISOPRED, and MFDp2 show prediction accuracy biased toward ordered regions. Compared to the other tools, the Rosetta ResidueDisorder method shows consistent prediction accuracy throughout all levels of disorder.

**Figure 5. ROC curve analysis of the benchmark dataset**
The ROC curves of 7 different prediction tools are shown: Rosetta ResidueDisorder (blue); IUPred (orange); PrDOS (green); PONDR VL3-H (red); MFDp2 (purple); Meta-Disorder (brown); DISOPRED (pink). AUCs are shown in the legend.
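For readers unfamiliar with the AUC values reported in the legend, the following is a minimal sketch of how an area under the ROC curve can be computed from per-residue scores and true labels. It uses the pairwise-ranking formulation of the AUC (the probability that a randomly chosen disordered residue scores higher than a randomly chosen ordered one), which is a standard identity and not necessarily the procedure the authors used; the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` are truthy for disordered residues. Higher scores are taken
    to indicate disorder. Ties count as half a win. Runs in O(P * N) time,
    which is fine for a sketch but slow for very large data sets.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one residue of each class")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 1.0 means every disordered residue outscored every ordered residue, while 0.5 corresponds to random ranking.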

**Figure 6. Comparison of 6 prediction tools’ average prediction accuracy of the test dataset**
The bar graph compares the average percent accuracy across 5 IDP categories (0% disordered proteins (blue); 30% disordered proteins (yellow); 50% disordered proteins (green); 70% disordered proteins (red); 100% disordered proteins (purple)) for each prediction tool. The error bars represent the standard deviation for each IDP category. PONDR VL3-H and Meta-Disorder show prediction accuracy biased toward long disordered regions, whereas PrDOS, IUPred, DISOPRED, and MFDp2 show prediction accuracy biased toward ordered proteins. Compared to the other tools, the Rosetta ResidueDisorder method shows consistent prediction accuracy throughout all levels of disorder.

**Figure 7. ROC curve analysis of the test dataset**
The ROC curves of 7 different prediction tools are shown: Rosetta ResidueDisorder (blue); IUPred (orange); PrDOS (green); PONDR VL3-H (red); MFDp2 (purple); Meta-Disorder (brown); DISOPRED (pink). AUCs are shown in the legend.

Cited by

Predicting substitutions to modulate disorder and stability in coiled-coils.
Karami Y, Saighi P, Vanderhaegen R, Gerlier D, Longhi S, Laine E, Carbone A. BMC Bioinformatics. 2020 Dec 21;21(Suppl 19):573. doi: 10.1186/s12859-020-03867-x. PMID: 33349244.
Protein shape sampled by ion mobility mass spectrometry consistently improves protein structure prediction.
Turzo SMBA, Seffernick JT, Rolland AD, Donor MT, Heinze S, Prell JS, Wysocki VH, Lindert S. Nat Commun. 2022 Jul 28;13(1):4377. doi: 10.1038/s41467-022-32075-9. PMID: 35902583.
Computational Structure Prediction for Antibody-Antigen Complexes From Hydrogen-Deuterium Exchange Mass Spectrometry: Challenges and Outlook.
Tran MH, Schoeder CT, Schey KL, Meiler J. Front Immunol. 2022 May 26;13:859964. doi: 10.3389/fimmu.2022.859964. eCollection 2022. PMID: 35720345. Review.
Investigating In Situ Expression of c-MYC and Candidate Ubiquitin-Specific Proteases in DLBCL and Assessment for Peptidyl Disruptor Molecule against c-MYC-USP37 Complex.
Kamran DES, Hussain M, Mirza T. Molecules. 2023 Mar 7;28(6):2441. doi: 10.3390/molecules28062441. PMID: 36985413.
Predicting ion mobility collision cross sections using projection approximation with ROSIE-PARCS webserver.
Turzo SMBA, Seffernick JT, Lyskov S, Lindert S. Brief Bioinform. 2023 Sep 20;24(5):bbad308. doi: 10.1093/bib/bbad308. PMID: 37609950.
