Length-dependent prediction of protein intrinsic disorder
- PMID: 16618368
- PMCID: PMC1479845
- DOI: 10.1186/1471-2105-7-208
Abstract
Background: Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research, as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure Prediction (CASP6). Since the initial work by Romero et al. (Identifying disordered regions in proteins from amino acid sequences, IEEE Int. Conf. Neural Netw., 1997), our group has developed several predictors optimized for long disordered regions (>30 residues) with prediction accuracy exceeding 85%. However, these predictors are less successful on short disordered regions (≤30 residues). A probable cause is the length dependence of the amino acid compositions and sequence properties of disordered regions.
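The short/long distinction above can be made concrete with a small sketch. It assumes a per-residue label string in which "D" marks a disordered residue and "O" an ordered one; this encoding and the function names are illustrative assumptions, not the authors' data format. Only the 30-residue threshold comes from the abstract.

```python
from itertools import groupby

# From the abstract: long disordered regions are >30 residues, short are <=30.
LONG_THRESHOLD = 30


def disordered_regions(labels):
    """Return (start, end, length) for each run of 'D' in a label string.

    `labels` is a hypothetical per-residue annotation: 'D' = disordered,
    'O' = ordered. `end` is exclusive.
    """
    regions = []
    pos = 0
    for state, run in groupby(labels):
        n = len(list(run))
        if state == "D":
            regions.append((pos, pos + n, n))
        pos += n
    return regions


def split_by_length(labels, threshold=LONG_THRESHOLD):
    """Partition disordered regions into (short, long) lists by length."""
    short, long_ = [], []
    for region in disordered_regions(labels):
        (long_ if region[2] > threshold else short).append(region)
    return short, long_
```

A region of exactly 30 residues falls in the short group, matching the ≤30 / >30 boundary stated above.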
Results: We proposed two new predictor models, VSL2-M1 and VSL2-M2, to address this length-dependency problem in prediction of intrinsic protein disorder. These two predictors are similar to the original VSL1 predictor used in the CASP6 experiment. In both models, two specialized predictors were first built and optimized for short (≤30 residues) and long (>30 residues) disordered regions, respectively. A meta predictor was then trained to integrate the specialized predictors into the final predictor model. As the 10-fold cross-validation results showed, the VSL2 predictors achieved well-balanced prediction accuracies of 81% on both short and long disordered regions. Comparisons over the VSL2 training dataset via 10-fold cross-validation and a blind-test set of unrelated recent PDB chains indicated that VSL2 predictors were significantly more accurate than several existing predictors of intrinsic protein disorder.
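The two-level design described above can be illustrated with a minimal sketch. The abstract does not state how the meta predictor integrates the two specialized predictors, so the combination rule here is an assumption: the meta output is treated as a per-residue weight for the long-region predictor, and the final score is a weighted average. The function names, the 0.5 cutoff, and the "D"/"O" output encoding are likewise hypothetical.

```python
def combine(short_scores, long_scores, meta_weights):
    """Blend two specialized per-residue disorder scores.

    Assumed rule (not confirmed by the abstract):
        final = w * long + (1 - w) * short
    where w is the meta predictor's output in [0, 1] for that residue.
    """
    if not (len(short_scores) == len(long_scores) == len(meta_weights)):
        raise ValueError("all three score lists must have the same length")
    return [
        w * l + (1.0 - w) * s
        for s, l, w in zip(short_scores, long_scores, meta_weights)
    ]


def call_disorder(scores, cutoff=0.5):
    """Convert scores to labels: 'D' at or above the (assumed) cutoff, else 'O'."""
    return "".join("D" if p >= cutoff else "O" for p in scores)
```

With a weight of 1.0 the blended score equals the long-region predictor's output, and with 0.0 it equals the short-region predictor's output, so the meta predictor effectively decides which specialist to trust at each residue.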
Conclusion: The VSL2 predictors are applicable to disordered regions of any length and can accurately identify the short disordered regions that are often misclassified by our previous disorder predictors. The success of the VSL2 predictors further confirmed the previously observed differences in amino acid compositions and sequence properties between short and long disordered regions, and justified our approaches for modelling short and long disordered regions separately. The VSL2 predictors are freely accessible for non-commercial use at http://www.ist.temple.edu/disprot/predictorVSL2.php.
Similar articles
- Optimizing long intrinsic disorder predictors with protein evolutionary information. J Bioinform Comput Biol. 2005 Feb;3(1):35-60. doi: 10.1142/s0219720005000886. PMID: 15751111
- Exploiting heterogeneous sequence properties improves prediction of protein disorder. Proteins. 2005;61 Suppl 7:176-182. doi: 10.1002/prot.20735. PMID: 16187360
- FoldUnfold: web server for the prediction of disordered regions in protein chain. Bioinformatics. 2006 Dec 1;22(23):2948-9. doi: 10.1093/bioinformatics/btl504. Epub 2006 Oct 4. PMID: 17021161
- Natively disordered proteins: functions and predictions. Appl Bioinformatics. 2004;3(2-3):105-13. doi: 10.2165/00822942-200403020-00005. PMID: 15693736 Review.
- Five hierarchical levels of sequence-structure correlation in proteins. Appl Bioinformatics. 2004;3(2-3):97-104. doi: 10.2165/00822942-200403020-00004. PMID: 15693735 Review.
Cited by
- Free cysteine modulates the conformation of human C/EBP homologous protein. PLoS One. 2012;7(4):e34680. doi: 10.1371/journal.pone.0034680. Epub 2012 Apr 4. PMID: 22496840 Free PMC article.
- Discovering putative prion sequences in complete proteomes using probabilistic representations of Q/N-rich domains. BMC Genomics. 2013 May 10;14:316. doi: 10.1186/1471-2164-14-316. PMID: 23663289 Free PMC article.
- Intrinsically disordered regions of p53 family are highly diversified in evolution. Biochim Biophys Acta. 2013 Apr;1834(4):725-38. doi: 10.1016/j.bbapap.2013.01.012. Epub 2013 Jan 22. PMID: 23352836 Free PMC article.
- The mammalian cholesterol synthesis enzyme squalene monooxygenase is proteasomally truncated to a constitutively active form. J Biol Chem. 2021 Jan-Jun;296:100731. doi: 10.1016/j.jbc.2021.100731. Epub 2021 Apr 30. PMID: 33933449 Free PMC article.
- Intrinsic Disorder in Transmembrane Proteins: Roles in Signaling and Topology Prediction. PLoS One. 2016 Jul 8;11(7):e0158594. doi: 10.1371/journal.pone.0158594. eCollection 2016. PMID: 27391701 Free PMC article.