Toward the detection and validation of repeats in protein structure
- PMID: 15340924
- DOI: 10.1002/prot.20202
Abstract
We present a method called DAVROS to detect, localize, and validate repeating motifs in protein structure, allowing for insertions and deletions. DAVROS takes the score matrix from a structural alignment program (SAP) and searches it for repeating motifs using an algorithm based on concepts from signal processing and on the statistical properties of the alignments. The method was tested against a nonredundant subset of the Protein Data Bank, and each chain was assigned a score. Of the top 50 chains ranked by score, 70% contain repeating motifs detected without error. These represent 14 types of fold covering the alpha, beta, and alpha-beta protein classes. A second data set, comprising protein chains from different sequence families for the triosephosphate isomerase (TIM) barrel, leucine-rich repeat (LRR), trefoil, and alpha-alpha barrel folds, was used to assess the ability of DAVROS to detect all motifs within a specific fold. For this second test set, the percentage of motifs detected was highest for the LRR chains (88.7%) and lowest for the TIM barrels (60%). This variability results from the regularity of the LRR motif compared with the alpha-beta units of the TIM barrel, which generally have many more indels. Indels reduce the strength of the repeat signal in the SAP matrix, making repeat detection more difficult.
Copyright 2004 Wiley-Liss, Inc.
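The abstract gives no algorithmic detail, so the following is only a toy illustration of the general idea of reading a repeat signal out of a self-comparison score matrix. It is not the DAVROS algorithm and does not use SAP. It assumes a square matrix of residue-versus-residue scores, averages each off-diagonal, and takes the offset with the strongest mean as the repeat length. The function names (`diagonal_profile`, `estimate_period`, `toy_matrix`) and the idealized 0/1 matrix are hypothetical, and the sketch makes no attempt to handle insertions or deletions.

```python
def diagonal_profile(score):
    """Return the mean score along each off-diagonal k = 1..n-1.

    profile[k - 1] is the mean of score[i][i + k] over all valid i.
    """
    n = len(score)
    profile = []
    for k in range(1, n):
        total = sum(score[i][i + k] for i in range(n - k))
        profile.append(total / (n - k))
    return profile


def estimate_period(score, min_offset=2, max_offset=None):
    """Pick the offset whose off-diagonal has the highest mean score.

    Ties go to the smallest offset, so multiples of the true repeat
    length do not win over the repeat length itself.
    """
    n = len(score)
    if max_offset is None:
        max_offset = n // 2
    profile = diagonal_profile(score)
    return max(range(min_offset, max_offset + 1), key=lambda k: profile[k - 1])


def toy_matrix(n, period):
    """Idealized matrix (assumption): score 1.0 when residues i and j
    occupy the same position within a repeat unit, otherwise 0.0."""
    return [[1.0 if (i - j) % period == 0 else 0.0 for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    matrix = toy_matrix(40, 8)
    print(estimate_period(matrix))
```

In a real score matrix, indels shift matching residues off a single diagonal, which smears this signal across neighbouring offsets. That is consistent with the abstract's observation that indel-rich folds such as TIM barrels are harder to detect than regular LRR chains.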