BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S34.
doi: 10.1186/1471-2105-11-S1-S34.

Efficient protein alignment algorithm for protein search

Zaixin Lu et al. BMC Bioinformatics.

Abstract

Background: Proteins show a great variety of 3D conformations, which can be used to infer their evolutionary relationships and to classify them into more general groups; protein structure alignment algorithms are therefore very helpful to protein biologists. However, an accurate alignment algorithm alone may be insufficient for effectively discovering structural relationships among tens of thousands of proteins. Because the amount of protein structural data is growing exponentially, a fast and accurate structure alignment tool is needed for protein classification and protein similarity search; however, the complexity of current alignment algorithms is usually too high to make fully alignment-based classification and search practical.

Results: We have developed an efficient pairwise protein alignment algorithm and applied it to our protein search tool, which aligns a query protein structure pairwise with every protein structure in the Protein Data Bank (PDB) and outputs the similar structures. The algorithm can align hundreds of pairs of protein structures in one second. Given a protein structure, the tool finds similar structures among the tens of thousands stored in the PDB, always within 2 minutes on a single machine and within 20 seconds on our cluster of 6 machines. The algorithm has been fully implemented and is accessible online at our webserver, which is supported by a cluster of computers.

Conclusion: Our algorithm can compute hundreds of pairwise protein alignments in one second, which makes it well suited to protein search. Our experimental results show that it is more accurate than other well-known protein search systems at finding proteins that are structurally similar at the SCOP family and superfamily levels, and that its speed is competitive with those systems. In pairwise alignment performance, it is as good as some well-known alignment algorithms.
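The search workflow described in the abstract (align one query against every stored structure, then report the most similar ones) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_score` is a hypothetical placeholder that only compares chain lengths, the names `search` and `estimated_scan_seconds` are invented here, and the thread pool merely stands in for splitting the database across several machines.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def toy_score(query: Coords, target: Coords) -> float:
    """Hypothetical placeholder score in [0, 1]; NOT the paper's alignment."""
    shorter, longer = sorted((len(query), len(target)))
    return shorter / longer if longer else 0.0


def search(
    query: Coords,
    database: Dict[str, Coords],
    score: Callable[[Coords, Coords], float] = toy_score,
    workers: int = 6,
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score the query against every entry and return the top_k hits."""
    ids = list(database)
    # One interleaved chunk per worker, mimicking a database split
    # across several machines (assumption: entries are independent).
    chunks = [ids[i::workers] for i in range(workers)]

    def scan(chunk: List[str]) -> List[Tuple[str, float]]:
        return [(pid, score(query, database[pid])) for pid in chunk]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = [hit for part in pool.map(scan, chunks) for hit in part]
    # Highest score first; ties broken by identifier for determinism.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits[:top_k]


def estimated_scan_seconds(
    n_structures: int, pairs_per_second: float, machines: int = 1
) -> float:
    """Ideal full-scan time, assuming perfect scaling across machines."""
    if pairs_per_second <= 0 or machines <= 0:
        raise ValueError("pairs_per_second and machines must be positive")
    return n_structures / (pairs_per_second * machines)
```

With illustrative figures of 36,000 structures and 300 alignments per second (both assumptions chosen to match "tens of thousands" and "hundreds of pairs"), `estimated_scan_seconds` gives 120 seconds on one machine and 20 seconds on six, which is consistent with the timings quoted in the abstract under ideal scaling.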

Figures

Figure 2
Precision and recall curves. Figure 2 shows the accuracy of multiple protein search methods. The left panel shows the precision and recall of 108 queries by multiple methods at the SCOP family level, and the right panel shows those of 129 queries by the same methods at the SCOP superfamily level.
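For readers unfamiliar with how such curves are built, the sketch below computes precision and recall at every rank cutoff of one ranked result list. It uses the standard information-retrieval definitions; the function name and the idea that "relevant" means "same SCOP family or superfamily as the query" are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence, Set, Tuple


def precision_recall_curve(
    ranked_ids: Sequence[str], relevant: Set[str]
) -> List[Tuple[float, float]]:
    """Return (precision, recall) after each rank of a ranked hit list."""
    if not relevant:
        raise ValueError("at least one relevant structure is required")
    points: List[Tuple[float, float]] = []
    hits = 0
    for k, pid in enumerate(ranked_ids, start=1):
        if pid in relevant:
            hits += 1
        # precision = relevant hits so far / results returned so far
        # recall    = relevant hits so far / all relevant structures
        points.append((hits / k, hits / len(relevant)))
    return points
```

Averaging these points over all queries at matching recall levels is one common way to obtain a single curve per method.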
Figure 1
Q-score difference plots. Figure 1 shows the Q-score difference between our algorithm and each of CE [12], Dali [37], and SSM [8].
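This page does not define the Q-score. The sketch below assumes the definition commonly associated with SSM, Q = N_align^2 / ((1 + (RMSD / R0)^2) * N1 * N2) with R0 = 3 Å, where N_align is the number of aligned residues and N1, N2 are the two chain lengths. Whether the paper uses exactly this form is an assumption, and the function names are illustrative.

```python
def q_score(
    n_aligned: int, rmsd: float, n1: int, n2: int, r0: float = 3.0
) -> float:
    """Q-score as commonly defined for SSM (assumed here); range 0 to 1."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("chain lengths must be positive")
    if n_aligned < 0 or n_aligned > min(n1, n2):
        raise ValueError("n_aligned must lie between 0 and min(n1, n2)")
    # More aligned residues raise Q; a larger RMSD lowers it.
    return (n_aligned ** 2) / ((1.0 + (rmsd / r0) ** 2) * n1 * n2)


def q_score_difference(q_ours: float, q_other: float) -> float:
    """Positive when the first method yields the higher Q-score."""
    return q_ours - q_other
```

Under this definition, a perfect alignment of two identical 100-residue chains (RMSD 0) scores 1.0, while aligning 50 of 100 residues at an RMSD of 3 Å scores 0.125.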
Figure 3
Flowchart. Figure 3 is a flowchart of our alignment algorithm.

References

    1. Levitt M. Growth of novel protein structural data. Proceedings of the National Academy of Sciences of the United States of America. 2007;104:3183–3188. doi: 10.1073/pnas.0611678104.
    2. Chew LP, Kedem K, Huttenlocher DP, Kleinberg J. Fast detection of geometric substructure in proteins. J of Computational Biology. 1999;6(3-4):313–325. doi: 10.1089/106652799318292.
    3. Falicov A, Cohen FE. A surface of minimum area metric for the structural comparison of protein. Journal of Mol Biol. 1996;258:871–892. doi: 10.1006/jmbi.1996.0294.
    4. Fischer D, Nussinov R, Wolfson H. 3D substructure matching in protein molecules. Proc 3rd Intl Symp Combinatorial Pattern Matching, Lecture Notes in Computer Science. 1992;644:136–150.
    5. Holm L, Sander C. Protein structure comparison by alignment of distance matrices. J Mol Biol. 1993;233:123–138. doi: 10.1006/jmbi.1993.1489.