Detecting patterns in protein sequences
- PMID: 8014990
- DOI: 10.1006/jmbi.1994.1407
Abstract
The detection of conserved sequence patterns (motifs) in related proteins often yields valuable structural and functional insights. We describe a method that utilizes rigorous statistics and a depth-first search procedure to efficiently and exhaustively search a set of proteins for significant patterns up to a specified length. Additional procedures classify related patterns into groups and identify protein segments most likely to share a common motif. The utility of the method was demonstrated on several difficult test problems: detection of motifs among 56 proteins in the acyltransferase family, detection of a dinucleotide-binding fold present within a small subset of a set of 91 distantly related and unrelated proteins, detection of the helix-turn-helix motif in 15 distantly related proteins, and detection of subtle internal repeats in a prenyltransferase. In a search of a large set of sequences for internal repeats, the method detected novel ankyrin-like repeats in an Escherichia coli protein.
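To make the idea of a depth-first pattern search concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the function name `find_patterns`, the `.` wildcard notation, the toy sequences, and the use of a plain support count (number of sequences containing a pattern) in place of the paper's rigorous statistics are all assumptions made for illustration. The sketch grows each pattern one position at a time and abandons a branch as soon as too few sequences still match.

```python
from typing import Dict, List, Tuple

WILDCARD = "."  # assumed notation for a "match any residue" position


def find_patterns(seqs: List[str], max_len: int, min_support: int) -> Dict[str, int]:
    """Depth-first enumeration of patterns (residues plus wildcards) of length
    <= max_len that occur in at least min_support of the input sequences.

    Returns a dict mapping each pattern to the number of sequences containing it.
    A simple support threshold stands in for a real significance test.
    """
    results: Dict[str, int] = {}

    def extend(pattern: str, occurrences: List[Tuple[int, int]]) -> None:
        # occurrences holds (sequence index, start position) pairs that match pattern
        if len(pattern) >= max_len:
            return
        offset = len(pattern)
        by_residue: Dict[str, List[Tuple[int, int]]] = {}
        extendable: List[Tuple[int, int]] = []
        for si, start in occurrences:
            pos = start + offset
            if pos < len(seqs[si]):
                by_residue.setdefault(seqs[si][pos], []).append((si, start))
                extendable.append((si, start))
        # Extend with each concrete residue; prune branches with too little support.
        for residue, occ in sorted(by_residue.items()):
            support = len({si for si, _ in occ})
            if support >= min_support:
                child = pattern + residue
                results[child] = support
                extend(child, occ)
        # Extend with a wildcard, but never start a pattern with one, and only
        # continue while enough sequences can still be extended.
        if pattern and len({si for si, _ in extendable}) >= min_support:
            extend(pattern + WILDCARD, extendable)

    all_starts = [(si, p) for si, s in enumerate(seqs) for p in range(len(s))]
    extend("", all_starts)
    return results


# Toy input (hypothetical sequences, not data from the paper).
toy_seqs = ["GAKTL", "MAKSL", "AKQLP"]
toy_patterns = find_patterns(toy_seqs, max_len=4, min_support=3)
for pat in sorted(toy_patterns, key=lambda p: (len(p), p)):
    print(pat, toy_patterns[pat])
```

Because the occurrence list shrinks or stays the same at every extension, a pattern's support can never exceed that of its prefix, which is what makes the pruning safe. A significance-based method would replace the `min_support` test with a probability estimate under a background model of residue frequencies.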