DomNet: protein domain boundary prediction using enhanced general regression network and new profiles
- PMID: 18556265
- DOI: 10.1109/TNB.2008.2000747
Abstract
Accurate and stable prediction of protein domain boundaries is an important step toward predicting protein structure, function, and evolution, and toward protein design. Recent research on protein domain boundary prediction has relied mainly on widely known machine learning techniques. In this paper, we propose a new machine-learning-based domain predictor, DomNet, that can achieve more accurate and stable predictive performance than existing state-of-the-art models. DomNet is trained using a novel compact domain profile, secondary structure, solvent accessibility information, and an interdomain linker index to detect possible domain boundaries in a target sequence. Its performance was compared with that of nine different machine learning models on the Benchmark_2 dataset in terms of accuracy, sensitivity, specificity, and correlation coefficient. DomNet achieved the best performance, with 71% accuracy for domain boundary identification in multidomain proteins. On the CASP7 benchmark dataset, it again outperformed contemporary domain boundary predictors such as DOMpro, DomPred, DomSSEA, DomCut, and DomainDiscovery.
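The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes each residue is labelled as boundary or non-boundary, and that "correlation coefficient" means the Matthews correlation coefficient; the abstract does not state either. The function name `boundary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def boundary_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity, and MCC from confusion counts.

    tp/fn: true boundary positions that were predicted / missed.
    tn/fp: non-boundary positions correctly rejected / wrongly flagged.
    Assumption: the abstract's "correlation coefficient" is treated as the
    Matthews correlation coefficient (MCC).
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")

    accuracy = (tp + tn) / total
    # Ratios with an empty denominator are reported as 0.0 rather than raising.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Illustrative counts only; these are not results from the paper.
for name, value in boundary_metrics(tp=30, tn=40, fp=10, fn=20).items():
    print(f"{name}: {value:.3f}")
```

Reporting all four measures together is useful because boundary residues are typically far rarer than non-boundary residues, so accuracy alone can look favorable for a predictor that rarely predicts a boundary.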