Extracting protein alignment models from the sequence database
- PMID: 9108146
- PMCID: PMC146639
- DOI: 10.1093/nar/25.9.1665
Abstract
Biologists often gain structural and functional insights into a protein sequence by constructing a multiple alignment model of the family. Here a program called Probe fully automates this process of model construction starting from a single sequence. Central to this program is a powerful new method to locate and align only those, often subtly, conserved patterns essential to the family as a whole. When applied to randomly chosen proteins, Probe found on average about four times as many relationships as a pairwise search and yielded many new discoveries. These include: an obscure subfamily of globins in the roundworm Caenorhabditis elegans; two new superfamilies of metallohydrolases; a lipoyl/biotin swinging arm domain in bacterial membrane fusion proteins; and a DH domain in the yeast Bud3 and Fus2 proteins. By identifying distant relationships and merging families into superfamilies in this way, this analysis further confirms the notion that proteins evolved from relatively few ancient sequences. Moreover, this method automatically generates models of these ancient conserved regions for rapid and sensitive screening of sequences.
Similar articles
- HomologyPlot: searching for homology to a family of proteins using a database of unique conserved patterns. J Comput Aided Mol Des. 1994 Apr;8(2):193-210. doi: 10.1007/BF00119867. PMID: 8064334
- An integrated approach to the analysis and modeling of protein sequences and structures. III. A comparative study of sequence conservation in protein structural families using multiple structural alignments. J Mol Biol. 2000 Aug 18;301(3):691-711. doi: 10.1006/jmbi.2000.3975. PMID: 10966778
- Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins. 1997 Jul;28(3):405-20. doi: 10.1002/(sici)1097-0134(199707)28:3<405::aid-prot10>3.0.co;2-l. PMID: 9223186
- Sequence similarity analysis of Escherichia coli proteins: functional and evolutionary implications. Proc Natl Acad Sci U S A. 1995 Dec 5;92(25):11921-5. doi: 10.1073/pnas.92.25.11921. PMID: 8524875 Free PMC article.
- Structural divergence and distant relationships in proteins: evolution of the globins. Curr Opin Struct Biol. 2005 Jun;15(3):290-301. doi: 10.1016/j.sbi.2005.05.008. PMID: 15922591 Review.
Cited by
- Structural effects of the active site mutation cysteine to serine in Bacillus cereus zinc-beta-lactamase. Protein Sci. 2000 Jul;9(7):1402-6. doi: 10.1110/ps.9.7.1402. PMID: 10933508 Free PMC article.
- The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003 Sep 11;4:41. doi: 10.1186/1471-2105-4-41. Epub 2003 Sep 11. PMID: 12969510 Free PMC article.
- The glycine brace: a component of Rab, Rho, and Ran GTPases associated with hinge regions of guanine- and phosphate-binding loops. BMC Struct Biol. 2009 Mar 5;9:11. doi: 10.1186/1472-6807-9-11. PMID: 19265520 Free PMC article.
- Histone deacetylases, acetoin utilization proteins and acetylpolyamine amidohydrolases are members of an ancient protein superfamily. Nucleic Acids Res. 1997 Sep 15;25(18):3693-7. doi: 10.1093/nar/25.18.3693. PMID: 9278492 Free PMC article.
- Minimum information for reporting next generation sequence genotyping (MIRING): Guidelines for reporting HLA and KIR genotyping via next generation sequencing. Hum Immunol. 2015 Dec;76(12):954-62. doi: 10.1016/j.humimm.2015.09.011. Epub 2015 Sep 25. PMID: 26407912 Free PMC article.