Construction of a dictionary of sequence motifs that characterize groups of related proteins
- PMID: 1438158
- PMCID: PMC7528547
- DOI: 10.1093/protein/5.6.479
Abstract
An automatic procedure is proposed to identify, from the protein sequence database, conserved amino acid patterns (or sequence motifs) that are exclusive to a group of functionally related proteins. This procedure is applied to the PIR database, and a dictionary of sequence motifs that relate to specific superfamilies is constructed. The motifs are of practical use in identifying membership of specific superfamilies without the need to perform sequence database searches in 20% of newly determined sequences. The sequence motifs identified represent functionally important sites on protein molecules. When multiple blocks exist in a single motif, they are often close together in the 3-D structure. Furthermore, when the correlation with exon structures was examined, these motif blocks were occasionally found to be split by introns.
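The idea of a pattern that is "exclusive to a group" can be illustrated with a minimal sketch. It is not the procedure used in the paper, whose details are not given in the abstract. It assumes that motifs are fixed-length contiguous substrings, that a motif must occur in every member of a group, and that it must occur in no sequence of any other group. The function name `exclusive_motifs`, the parameter `k`, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def exclusive_motifs(groups, k=4):
    """Return, per group, the length-k substrings found in every member
    of that group and in no sequence of any other group.

    Simplifying assumption: a motif is a contiguous k-mer with no
    wildcards or gaps.
    """
    seen_in = defaultdict(set)  # k-mer -> names of groups where it occurs
    conserved = {}              # group name -> k-mers shared by all members

    for name, seqs in groups.items():
        # Collect the set of k-mers for each sequence in the group.
        per_seq = [{s[i:i + k] for i in range(len(s) - k + 1)} for s in seqs]
        for kmers in per_seq:
            for kmer in kmers:
                seen_in[kmer].add(name)
        conserved[name] = set.intersection(*per_seq) if per_seq else set()

    # Keep only conserved k-mers that never occur outside their own group.
    return {
        name: sorted(m for m in kmers if seen_in[m] == {name})
        for name, kmers in conserved.items()
    }


# Made-up toy sequences, for illustration only.
toy = {
    "family_A": ["MKGHCWAL", "TTGHCWPQ"],
    "family_B": ["MKDEFLAL", "QQDEFLRR"],
}
motifs = exclusive_motifs(toy, k=4)
print(motifs)  # {'family_A': ['GHCW'], 'family_B': ['DEFL']}
```

Under these assumptions, a new sequence containing `GHCW` would be assigned to `family_A` by a simple substring test, with no database search. Real motif dictionaries generally allow variable positions and multiple blocks, which this sketch omits.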