Fast structure alignment for protein databank searching
- PMID: 1409565
- DOI: 10.1002/prot.340140203
Abstract
A fast method is described for searching and analyzing the protein structure databank. It uses secondary structure followed by residue matching to compare protein structures and is developed from a previous structural alignment method based on dynamic programming. Linear representations of secondary structures are derived and their features compared to identify equivalent elements in two proteins. The secondary structure alignment then constrains the residue alignment, which compares only residues within aligned secondary structures and with similar buried areas and torsional angles. The initial secondary structure alignment improves accuracy and provides a means of filtering out unrelated proteins before the slower residue alignment stage. It is possible to search or sort the protein structure databank very quickly using just secondary structure comparisons. A search through 720 structures with a probe protein of 10 secondary structures required 1.7 CPU hours on a Sun 4/280. Alternatively, combined secondary structure and residue alignments, with a cutoff on the secondary structure score to remove pairs of unrelated proteins from further analysis, took 10.1 CPU hours. The method was applied in searches on different classes of proteins and to cluster a subset of the databank into structurally related groups. Relationships were consistent with known families of protein structure.
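The two-stage idea in the abstract (align secondary structure elements by dynamic programming, then use the score as a cutoff before any slower residue-level comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the `(type, length)` representation of each element, the scoring function, the gap penalty, and the names `sse_score`, `align_sses`, and `screen` are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical linear representation: each secondary structure element (SSE)
# is a (type, length) pair, e.g. ("H", 10) for a 10-residue helix.
SSE = Tuple[str, int]


def sse_score(a: SSE, b: SSE) -> float:
    """Toy similarity (an assumption, not the published scoring scheme):
    same type scores 10 minus the length difference, different types score -20."""
    if a[0] != b[0]:
        return -20.0
    return 10.0 - abs(a[1] - b[1])


def align_sses(p: List[SSE], q: List[SSE], gap: float = -5.0):
    """Global (Needleman-Wunsch style) alignment of two SSE strings.
    Returns the total score and the list of aligned index pairs."""
    n, m = len(p), len(q)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sse_score(p[i - 1], q[j - 1]),  # pair elements
                score[i - 1][j] + gap,  # leave p[i-1] unmatched
                score[i][j - 1] + gap,  # leave q[j-1] unmatched
            )
    # Trace back from the corner to recover which elements were paired.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + sse_score(p[i - 1], q[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


def screen(probe: List[SSE], databank: Dict[str, List[SSE]], cutoff: float) -> List[str]:
    """Keep only entries whose SSE alignment score reaches the cutoff;
    only these would go on to a slower residue-level alignment."""
    return [name for name, sses in databank.items()
            if align_sses(probe, sses)[0] >= cutoff]


probe = [("H", 10), ("E", 5), ("H", 12)]
databank = {
    "related": [("H", 11), ("H", 12)],
    "unrelated": [("E", 6), ("E", 7), ("E", 5)],
}
candidates = screen(probe, databank, cutoff=0.0)
```

In a fuller pipeline, the aligned pairs returned by `align_sses` would restrict which residues are compared in the second stage, which is how the abstract describes the secondary structure alignment constraining the residue alignment.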