Review

Biomolecules. 2020 Dec 4;10(12):1636.

doi: 10.3390/biom10121636.

Comparative Assessment of Intrinsic Disorder Predictions with a Focus on Protein and Nucleic Acid-Binding Proteins

Akila Katuwawala¹, Lukasz Kurgan¹

Affiliation

¹ Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA.

PMID: 33291838
PMCID: PMC7762010
DOI: 10.3390/biom10121636

Abstract

With over 60 disorder predictors, users need help navigating the predictor selection task. We review 28 surveys of disorder predictors, showing that only 11 include an assessment of predictive performance. We identify and address a few drawbacks of these past surveys. To this end, we release a novel benchmark dataset with reduced similarity to the training sets of the considered predictors. We use this dataset to perform a first-of-its-kind comparative analysis that targets two large functional families of disordered proteins that interact with proteins and with nucleic acids. We show that limiting sequence similarity between the benchmark and the training datasets has a substantial impact on predictive performance. We also demonstrate that predictive quality is sensitive to the use of well-annotated ordered regions and the inclusion of fully structured proteins in the benchmark datasets, both of which should be considered in future assessments. We identify three predictors that provide favorable results using the new benchmark set. While we find that VSL2B offers the most accurate and robust results overall, ESpritz-DisProt and SPOT-Disorder perform particularly well for disordered proteins. Moreover, we find that predictions for the disordered protein-binding proteins suffer from lower predictive quality than those for generic disordered proteins and the disordered nucleic acid-binding proteins. This can be explained by the high disorder content of the disordered protein-binding proteins, which makes it difficult for the current methods to accurately identify ordered regions in these proteins. This finding motivates the development of a new generation of methods that would target these difficult-to-predict disordered proteins. We also discuss resources that support users in collecting and identifying high-quality disorder predictions.

Keywords: intrinsic disorder; intrinsically disordered proteins; prediction; predictive performance; protein-nucleic acids interactions; protein-protein interactions.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Chronological summary of the past surveys of the intrinsic disorder and intrinsic disorder function predictors.

**Figure 2**
Comparison of the predictive quality measured with AUC (panel A; solid lines) and MCC (panel B; dashed lines). We report results on the *new benchmark* (in green; dataset with <30% sequence similarity to the training proteins + with experimental validation of structured regions + with fully structured proteins), based on recent *previous reports* (in black; datasets with no limits on sequence similarity to the training proteins + with no experimental validation of structured regions + with only disordered proteins), and based on a *similarity-limited benchmark* (in red; a version of the new benchmark dataset with <30% sequence similarity to the training proteins + no experimental validation of structured regions + only disordered proteins). The latter dataset is a proxy for the datasets used in prior studies, with the only difference being the reduced similarity to the training proteins. Disorder predictors are sorted by their AUC values on the new benchmark dataset.

**Figure 3**
Comparison of the predictive quality measured with AUC (panel A; solid lines) and MCC (panel B; dashed lines). We report results on the generic set of disordered proteins (i.e., proteins that have disordered residues) from the benchmark dataset (in black), the disordered protein-binding proteins (in orange), and the disordered nucleic acid-binding proteins (in blue). Disorder predictors are sorted by their AUC values on the disordered proteins.

**Figure 4**
Summary of the empirical comparative results.
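
The two measures named in the captions of Figures 2 and 3, AUC (area under the ROC curve) and MCC (Matthews correlation coefficient), can be illustrated with a minimal per-residue sketch. This is not the authors' evaluation code: the function names `auc` and `mcc`, the 0.5 cutoff used to binarize the scores, and the toy labels and scores are all assumptions made for illustration only.

```python
from math import sqrt


def mcc(labels, binary_preds):
    """Matthews correlation coefficient; 1 = disordered, 0 = ordered."""
    pairs = list(zip(labels, binary_preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(labels, scores):
    """AUC as the fraction of (disordered, ordered) residue pairs in which
    the disordered residue gets the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both disordered and ordered residues")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy protein: 3 disordered and 5 ordered residues.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]  # made-up propensities
preds = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 cutoff

print("AUC:", round(auc(labels, scores), 3))
print("MCC:", round(mcc(labels, preds), 3))
```

AUC is computed from the real-valued propensities and does not depend on a cutoff, whereas MCC is computed from the binary predictions and therefore changes with the cutoff. Both also depend on which ordered residues are counted as negatives, which is in line with the abstract's point that results are sensitive to how order is annotated in the benchmark.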

Cited by

Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins.
Zhao B, Ghadermarzi S, Kurgan L. Comput Struct Biotechnol J. 2023 Jun 2;21:3248-3258. doi: 10.1016/j.csbj.2023.06.001. eCollection 2023. PMID: 38213902 Free PMC article.
Capturing a Crucial 'Disorder-to-Order Transition' at the Heart of the Coronavirus Molecular Pathology-Triggered by Highly Persistent, Interchangeable Salt-Bridges.
Roy S, Ghosh P, Bandyopadhyay A, Basu S. Vaccines (Basel). 2022 Feb 16;10(2):301. doi: 10.3390/vaccines10020301. PMID: 35214759 Free PMC article.
flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions.
Hu G, Katuwawala A, Wang K, Wu Z, Ghadermarzi S, Gao J, Kurgan L. Nat Commun. 2021 Jul 21;12(1):4438. doi: 10.1038/s41467-021-24773-7. PMID: 34290238 Free PMC article.
Alternative Mechanisms for DNA Engagement by BET Bromodomain-Containing Proteins.
Kalra P, Zahid H, Ayoub A, Dou Y, Pomerantz WCK. Biochemistry. 2022 Jul 5;61(13):1260-1272. doi: 10.1021/acs.biochem.2c00157. Epub 2022 Jun 24. PMID: 35748495 Free PMC article.
Deep learning in prediction of intrinsic disorder in proteins.
Zhao B, Kurgan L. Comput Struct Biotechnol J. 2022 Mar 8;20:1286-1294. doi: 10.1016/j.csbj.2022.03.003. eCollection 2022. PMID: 35356546 Free PMC article. Review.

Grants and funding

1617369/National Science Foundation/International
