Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches
- PMID: 10222208
- DOI: 10.1006/jmbi.1999.2653
Abstract
Using a number of diverse protein families as test cases, we investigate the ability of the recently developed iterative sequence database search method, PSI-BLAST, to identify subtle relationships between proteins that were originally deemed detectable only at the level of structure-structure comparison. We show that PSI-BLAST can detect many, though not all, of these relationships, but success depends critically on the choice of the query sequence used to initiate the search. Generally, there is a correlation between the diversity of the sequences detected in the first pass of database screening and the ability of a given query to detect subtle relationships in subsequent iterations. Accordingly, to maximize the chances of gleaning non-trivial structural and functional inferences, a thorough analysis of protein superfamilies at the sequence level is necessary, as opposed to a single search initiated, for example, with the sequence of a protein whose structure is available.
This strategy is illustrated by several findings, each of which involves an unexpected structural prediction: (i) a number of previously undetected proteins with the HSP70-actin fold are identified, including a highly conserved and nearly ubiquitous family of metal-dependent proteases (typified by bacterial O-sialoglycoprotease) that represent an adaptation of this fold to a new type of enzymatic activity; (ii) contrary to previous conclusions, ATP-dependent and NAD-dependent DNA ligases are confidently predicted to possess the same fold; (iii) the C-terminal domain of 3-phosphoglycerate dehydrogenase, which binds serine and is involved in allosteric regulation of the enzyme activity, is shown to typify a new superfamily of ligand-binding, regulatory domains found primarily in enzymes and regulators of amino acid and purine metabolism; (iv) the immunoglobulin-like DNA-binding domain previously identified in the structures of transcription factors NF-κB and NFAT is shown to be a member of a distinct superfamily of intracellular and extracellular domains with the immunoglobulin fold; and (v) the Rag-2 subunit of the V-D-J recombinase is shown to contain a kelch-type beta-propeller domain, which rules out its evolutionary relationship with bacterial transposases.
Copyright 1999 Academic Press.
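The query-selection idea in the abstract (prefer the query whose first-pass hits are most diverse) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the abstract does not define a diversity measure, so the k-mer Jaccard distance used here, the function names (`hit_diversity`, `rank_queries`), and the input layout (a dict mapping each candidate query name to the sequences it retrieved in the first pass) are all assumptions made for illustration.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b, k=3):
    """Alignment-free dissimilarity: 1 - |shared k-mers| / |all k-mers|."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def hit_diversity(hits, k=3):
    """Mean pairwise k-mer distance among first-pass hits.

    Hypothetical stand-in for "diversity of the sequences detected in
    the first pass"; returns 0.0 when fewer than two hits are given.
    """
    pairs = list(combinations(hits, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b, k) for a, b in pairs) / len(pairs)


def rank_queries(first_pass_hits, k=3):
    """Order candidate query names from most to least diverse hit set."""
    return sorted(
        first_pass_hits,
        key=lambda name: hit_diversity(first_pass_hits[name], k),
        reverse=True,
    )
```

Under these assumptions, the top-ranked name would be the candidate used to seed the subsequent iterations; in practice a measure based on aligned sequence identity or on the taxonomic spread of the hits may be more appropriate than k-mer overlap.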