A generalized profile syntax for biomolecular sequence motifs and its function in automatic sequence interpretation
- PMID: 7584418
Abstract
A general syntax for expressing biomolecular sequence motifs is described, which will be used in future releases of the PROSITE data bank and in a similar collection of nucleic acid sequence motifs currently under development. The central part of the syntax is a regular structure which can be viewed as a generalization of the profiles introduced by Gribskov and coworkers. Accessory features implement specific motif search strategies and provide information helpful for the interpretation of predicted matches. Two contrasting examples, representing E. coli promoters and SH3 domains respectively, are shown to demonstrate the versatility of the syntax and its compatibility with diverse motif search methods. It is argued that a comprehensive machine-readable motif collection based on the new syntax, in conjunction with a standard search program, can serve as a general-purpose sequence interpretation and function prediction tool.
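To make the notion of a profile concrete, the following is a minimal sketch of the basic idea behind position-specific profiles: each motif position carries its own table of residue scores, and a sequence window is scored by summing the per-position scores. It is not the syntax described in the abstract and not the PROSITE format; it omits gaps and all accessory features. The names `score_window`, `scan`, `toy_profile`, the default mismatch score, and the cutoff are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy representation: one residue -> score table per motif position.
Profile = List[Dict[str, float]]


def score_window(profile: Profile, window: str, default: float = -1.0) -> float:
    """Sum the position-specific scores of an ungapped window.

    Residues missing from a position's table get `default` (an assumed value).
    """
    return sum(column.get(residue, default) for column, residue in zip(profile, window))


def scan(profile: Profile, sequence: str, cutoff: float) -> List[Tuple[int, float]]:
    """Slide the profile along the sequence and report windows scoring >= cutoff.

    Returns (0-based start position, score) pairs.
    """
    width = len(profile)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(profile, sequence[start:start + width])
        if score >= cutoff:
            hits.append((start, score))
    return hits


# Invented four-position nucleotide profile, for illustration only.
toy_profile: Profile = [
    {"T": 2.0, "A": 0.5},
    {"A": 2.0},
    {"T": 2.0, "C": 0.5},
    {"A": 2.0, "G": 0.5},
]

hits = scan(toy_profile, "GGTATACC", cutoff=6.0)
print(hits)
```

A generalized profile in the sense of the abstract would add further information on top of such per-position scores; this sketch only shows the ungapped scoring step that the term "profile" builds on.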