Nat Commun. 2021 Jul 21;12(1):4438.
doi: 10.1038/s41467-021-24773-7.

flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions

Gang Hu et al. Nat Commun.

Abstract

Identification of intrinsic disorder in proteins relies in large part on computational predictors, which demands that their accuracy should be high. Since intrinsic disorder carries out a broad range of cellular functions, it is desirable to couple the disorder and disorder function predictions. We report a computational tool, flDPnn, that provides accurate, fast and comprehensive disorder and disorder function predictions from protein sequences. The recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment and results on other test datasets demonstrate that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions. These predictions are substantially better than the results of the existing disorder predictors and methods that predict functions of disorder. Ablation tests reveal that the high predictive performance stems from innovative ways used in flDPnn to derive sequence profiles and encode inputs. flDPnn's webserver is available at http://biomine.cs.vcu.edu/servers/flDPnn/.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. The architecture of the flDPnn disorder predictor.
Green highlights identify novel elements. Gray boxes denote the disorder prediction and the disorder function prediction modules. DNN (deep neural network); RF (random forest). The example input and outputs correspond to the prediction for the nucleoporin protein from S. cerevisiae (UniProt: P14907; DisProt: DP01077).
Fig. 2. Comparison of predictive performance between flDPnn and other disorder predictors.
Assessment of the disorder predictions in the CAID experiment (A) and on the test dataset (B and C) and the predictions of the fully disordered proteins in the CAID experiment (D) and on the test dataset (E). AUC (area under the ROC curve); MCC (Matthews correlation coefficient). The predictive quality of the putative disorder propensities is quantified with AUC (green lines) and ROC curves (C). The ROC curves are color-coded where flDPnn is shown in red, ESpritz in violet, SPOT-Disorder-Single in dark green, IUPred2A-short in light green, IUPred2A-long in blue and random predictor in gray. The quality of the binary disorder predictions is assessed with MCC and F1 scores (blue bars). Panels A and D are reproduced from Fig. 2, Table 1, and Supplementary Table 5 from the CAID publication.
Fig. 3. Ablation analysis of the flDPnn predictor on the test dataset.
Black bars show results for the flDPnn model. Green bars quantify predictive quality where one of the two major innovations (disorder functions predictions in the profile and protein-level feature encoding) is excluded. Gray bars show the performance where one of the architectural elements that are often utilized by the current predictors (residue-level and window-level feature encoding) is excluded. AUC (area under the ROC curve); MCC (Matthews correlation coefficient).
Fig. 4. Assessment of the quality of the disorder function predictions for the IDRs predicted by flDPnn on the test dataset.
AUC (area under the ROC curve); MCC (Matthews correlation coefficient). The predictive quality for the putative function propensities and binary predictions is quantified with AUC (green lines) and with MCC and F1 scores (blue bars), respectively.
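
The figure captions above report three metrics: AUC for the putative propensities, and MCC and F1 for the binary predictions. The following is a minimal sketch of how these metrics can be computed from per-residue labels and scores. It is not the authors' evaluation code; the function names, the toy labels and scores, and the 0.5 binarization threshold are all assumptions made for illustration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = disordered residue."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined here as 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def f1(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 8 residues, 3 of them annotated as disordered.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    propensities = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
    # Assumed threshold of 0.5; a real predictor defines its own cutoff.
    binary = [1 if s >= 0.5 else 0 for s in propensities]
    print(f"AUC={auc(labels, propensities):.3f}",
          f"MCC={mcc(labels, binary):.3f}",
          f"F1={f1(labels, binary):.3f}")
```

AUC depends only on the ranking of the propensities, so it is unaffected by the choice of threshold, whereas MCC and F1 change with the cutoff used to binarize the scores. The pairwise AUC shown here is quadratic in the number of residues and is meant for clarity, not for large datasets.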

