Construction and analysis of a profile library characterizing groups of structurally known proteins
- PMID: 8897599
- PMCID: PMC2143267
- DOI: 10.1002/pro.5560051005
Abstract
A new sequence motif library, StrProf, was constructed to characterize the groups of related proteins in the PDB three-dimensional structure database. For a representative member of each protein family, identified by cross-referencing the PDB with the PIR superfamily classification, a group of related sequences was collected by a BLAST search against the nonredundant protein sequence database. For every group, motifs were identified automatically according to the criteria of conservation and uniqueness of pentapeptide patterns, using a dual dynamic programming algorithm. In the StrProf library, motifs are represented by profile matrices rather than consensus patterns to allow more flexible search capabilities. Another dynamic programming algorithm was then developed to search this motif library. When the computationally derived StrProf was compared with PROSITE, which is a manually derived motif library in the best consensus pattern representation, the numbers of identified patterns were comparable. StrProf missed about one third of the PROSITE motifs, but it also contained new motifs that are lacking in PROSITE. The new library was incorporated in SMART (Sequence Motif Analysis and Retrieval Tool), a computer tool designed to help search for and annotate biologically important sites in an unknown protein sequence. The client program is available free of charge through the Internet.
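The ideas named in the abstract (conserved and unique pentapeptides, profile matrices, searching a sequence against a profile) can be illustrated with a minimal Python sketch. It is not the StrProf or SMART implementation. All function names, the conservation threshold, the pseudocount, and the uniform background frequency are assumptions made for illustration, and the search step is simplified to an ungapped sliding window rather than the dynamic programming algorithm described by the authors.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pentapeptides(seq):
    """Return the set of all 5-residue substrings of a sequence."""
    return {seq[i:i + 5] for i in range(len(seq) - 4)}


def conserved_unique_pentapeptides(group, others, min_fraction=0.8):
    """Pentapeptides found in at least min_fraction of the group's
    sequences (conservation) and in none of the other sequences
    (uniqueness). The threshold is a hypothetical choice."""
    counts = Counter()
    for seq in group:
        counts.update(pentapeptides(seq))
    seen_elsewhere = set()
    for seq in others:
        seen_elsewhere |= pentapeptides(seq)
    needed = min_fraction * len(group)
    return {p for p, n in counts.items()
            if n >= needed and p not in seen_elsewhere}


def build_profile(segments, pseudocount=1.0):
    """Log-odds profile matrix from equal-length, ungapped segments.
    Assumes a uniform background frequency for every amino acid."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(segments) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(len(segments[0])):
        column = Counter(seg[pos] for seg in segments)
        profile.append({
            aa: math.log(((column[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_window(profile, sequence):
    """Score every ungapped window of the sequence against the profile
    and return (best_score, start_index). Unknown residues score 0."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        score = sum(profile[i].get(sequence[start + i], 0.0)
                    for i in range(width))
        best = max(best, (score, start))
    return best


# Toy data, invented for this example only.
group = ["MKTAYGDSGGPLV", "AAGDSGGPWW", "QGDSGGPRR"]
others = ["PPDSGGPKK"]
motif_seeds = conserved_unique_pentapeptides(group, others, min_fraction=1.0)
profile = build_profile(["GDSGGP", "GDSGGP", "GDSGGP"])
score, start = best_window(profile, "TTTGDSGGPTTT")
```

Because a profile stores a score for every residue at every position, a query that differs from the consensus at one position still receives a graded score instead of failing outright, which is the flexibility the abstract attributes to profile matrices over consensus patterns.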