Remote homology detection: a motif based approach
- PMID: 12855434
- DOI: 10.1093/bioinformatics/btg1002
Abstract
Motivation: Remote homology detection is the problem of detecting homology in cases of low sequence similarity. It is a hard computational problem with no approach that works well in all cases.
Results: We present a method for detecting remote homology that is based on the presence of discrete sequence motifs. The motif content of a pair of sequences is used to define a similarity that is used as a kernel for a Support Vector Machine (SVM) classifier. We test the method on two remote homology detection tasks: prediction of a previously unseen SCOP family and prediction of an enzyme class given other enzymes that have a similar function on other substrates. We find that it performs significantly better than an SVM method that uses BLAST or Smith-Waterman similarity scores as features.
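As a rough illustration of how motif content could define a kernel, the sketch below maps each sequence to a vector of motif occurrence counts and takes the dot product of two such vectors. The abstract does not say how the similarity is computed, so everything here is an assumption: the count-vector construction, the cosine-style normalization, the three regular-expression motifs in `MOTIFS`, and all function names are hypothetical and are not the authors' motif set or implementation.

```python
import math
import re

# Hypothetical motifs written as regular expressions ("." matches any residue).
# These are illustrative only and are not taken from the paper.
MOTIFS = [r"C..C", r"G[KR]S", r"H.H"]


def motif_counts(sequence, motifs=MOTIFS):
    """Return how many times each motif occurs in the sequence."""
    counts = []
    for motif in motifs:
        # The lookahead makes each match zero-width, so overlapping
        # occurrences of the same motif are all counted.
        pattern = re.compile("(?=" + motif + ")")
        counts.append(sum(1 for _ in pattern.finditer(sequence)))
    return counts


def motif_kernel(seq_a, seq_b, motifs=MOTIFS):
    """Dot product of the two motif count vectors (assumed similarity)."""
    a = motif_counts(seq_a, motifs)
    b = motif_counts(seq_b, motifs)
    return sum(x * y for x, y in zip(a, b))


def normalized_motif_kernel(seq_a, seq_b, motifs=MOTIFS):
    """Kernel scaled to [0, 1] so that sequence length matters less."""
    k_aa = motif_kernel(seq_a, seq_a, motifs)
    k_bb = motif_kernel(seq_b, seq_b, motifs)
    if k_aa == 0 or k_bb == 0:
        # A sequence with no motif hits shares nothing with any other.
        return 0.0
    return motif_kernel(seq_a, seq_b, motifs) / math.sqrt(k_aa * k_bb)


def gram_matrix(sequences, motifs=MOTIFS):
    """Pairwise kernel values for a list of sequences."""
    return [[normalized_motif_kernel(s, t, motifs) for t in sequences]
            for s in sequences]
```

Because the similarity is an inner product of explicit feature vectors, the resulting Gram matrix is positive semidefinite, so it could in principle be handed to any SVM implementation that accepts a precomputed kernel.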