Comparative Study
BMC Bioinformatics. 2006 Dec 18;7 Suppl 5(Suppl 5):S6.
doi: 10.1186/1471-2105-7-S5-S6.

Improving the performance of DomainDiscovery of protein domain boundary assignment using inter-domain linker index

Abdur R Sikder et al. BMC Bioinformatics. 2006.

Abstract

Background: Knowledge of protein domain boundaries is critical for the characterisation and understanding of protein function. The ability to identify domains from sequence information alone, without knowledge of the structure, is an essential step in many types of protein analyses. In this study, we demonstrate that the performance of DomainDiscovery improves significantly when the inter-domain linker index value is included for domain identification from sequence-based information. Improved DomainDiscovery uses a Support Vector Machine (SVM) approach and a unique training dataset built on the principle of consensus among experts in defining domains in protein structure. The SVM was trained using a PSSM (Position Specific Scoring Matrix), secondary structure, solvent accessibility information and the inter-domain linker index to detect possible domain boundaries for a target sequence.

Results: Improved DomainDiscovery is compared with other methods by benchmarking against a structurally non-redundant dataset and against CASP5 targets. Improved DomainDiscovery achieves 70% accuracy for domain boundary identification in multi-domain proteins.

Conclusion: Improved DomainDiscovery compares favourably with other methods and excels in the identification of domain boundaries for multi-domain proteins as a result of introducing a support vector machine with the Benchmark_2 dataset.

Figures

Figure 1
Example of input array for a window size of 21. The true boundary is the residue that SCOP defines as a boundary residue. For a positive example, we select 10 residues on each side of the true boundary residue; for a negative example, we randomly select 21 residues from the rest of the protein chain.
Figure 2
Six-fold cross-validation results for DomainDiscovery (averaged), determined by using SVM classifiers for the Benchmark_2 dataset.
Figure 3
Six-fold cross-validation results for Improved DomainDiscovery (averaged), determined by using SVM classifiers for the Benchmark_2 dataset.
Figure 4
Percentage of correct predictions by different methods, given for the number of domains defined for each method. If the boundary falls within 30% of the SCOP value, the prediction is counted as correct.
Figure 5
An example of domain boundary assignment by Improved DomainDiscovery. The domain boundary is at residue 73.
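
The window construction described in the Figure 1 caption can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: boundary positions are taken as 0-based indices, a boundary too close to a chain end yields no positive example, and "the rest of the protein chain" is read as any 21-residue stretch that does not overlap the positive window. The function names are hypothetical.

```python
import random

WINDOW = 21         # window size from the Figure 1 caption
HALF = WINDOW // 2  # 10 residues on each side of the centre residue


def positive_window(sequence, boundary):
    """Return the 21 residues centred on the boundary, or None if they do not fit."""
    start, end = boundary - HALF, boundary + HALF + 1
    if start < 0 or end > len(sequence):
        return None
    return sequence[start:end]


def negative_window(sequence, boundary, rng):
    """Pick a random 21-residue stretch that does not overlap the positive window."""
    candidates = [
        s for s in range(len(sequence) - WINDOW + 1)
        # Assumption: the window must lie entirely before or entirely after
        # the residues used for the positive example.
        if s + WINDOW <= boundary - HALF or s > boundary + HALF
    ]
    if not candidates:
        return None
    start = rng.choice(candidates)
    return sequence[start:start + WINDOW]
```

Passing a seeded `random.Random` instance as `rng` keeps the negative sampling reproducible between runs.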
