Algorithms Mol Biol. 2012 Feb 15;7:4.
doi: 10.1186/1748-7188-7-4.

TS-AMIR: a topology string alignment method for intensive rapid protein structure comparison


Jafar Razmara et al. Algorithms Mol Biol.

Abstract

Background: In structural biology, similarity analysis of protein structure is a crucial step in studying the relationship between proteins. Despite the considerable number of techniques that have been explored within the past two decades, the development of new alternative methods is still an active research area due to the need for high performance tools.

Results: In this paper, we present TS-AMIR, a Topology String Alignment Method for Intensive Rapid comparison of protein structures. The proposed method works in two stages. In the first stage, it generates a topology string from the geometric details of secondary structure elements and then applies an entropy-based n-gram modelling technique to capture similarities between these strings. The resulting initial correspondence map between secondary structure elements is passed to the second stage to obtain the alignment at the residue level. In the second stage, a heuristic step-by-step algorithm based on the Kabsch method aligns the residues, yielding an optimal rotation matrix and a minimized RMSD. The performance of the method was assessed in several information retrieval tests, and the results were compared with those of CE and TM-align, representing two geometrical tools, and YAKUSA, 3D-BLAST and SARST, representing three linear encoding schemes. The method achieves a running speed similar to that of the linear encoding schemes. In addition, it runs about 800 and 7200 times faster than TM-align and CE, respectively, while maintaining accuracy competitive with both.
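
As a rough illustration of how an n-gram model combined with an entropy measure can score the similarity of two strings, the following Python sketch computes the cross-entropy of a query string under an n-gram model estimated from a reference string. The abstract does not give the formulation used by TS-AMIR, so this is not the authors' method: the four-letter alphabet, the bigram order, the add-one smoothing, and the function names `ngram_counts` and `cross_entropy` are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(s, n):
    """Count all overlapping substrings of length n in s."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cross_entropy(query, reference, n=2, alphabet_size=4):
    """Average negative log2-probability of the query's n-grams under an
    add-one-smoothed n-gram model estimated from the reference string.

    alphabet_size is a hypothetical number of topology-string symbols;
    it only sets the size of the smoothing vocabulary.
    """
    ref = ngram_counts(reference, n)
    total = sum(ref.values())
    vocab = alphabet_size ** n  # number of possible n-grams
    grams = [query[i:i + n] for i in range(len(query) - n + 1)]
    if not grams:
        raise ValueError("query is shorter than n")
    log_probs = (math.log2((ref[g] + 1) / (total + vocab)) for g in grams)
    return -sum(log_probs) / len(grams)


# Hypothetical topology strings over the assumed alphabet {A, B, C, D}.
reference = "ABABABAB"
similar = cross_entropy("ABAB", reference)
dissimilar = cross_entropy("CDCD", reference)
```

Under these assumptions, a lower cross-entropy means the query's n-grams are better predicted by the reference, so `similar` comes out smaller than `dissimilar`.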

Conclusions: The experimental results demonstrate that linear encoding techniques are capable of reaching the same high degree of accuracy as that achieved by geometrical methods, while generally running hundreds of times faster than conventional programs.


Figures

Figure 1. Secondary structure elements representation as vectors in 3D-space. Dashed vectors represent inter-SSEs vectors.
Figure 2. A typical example for secondary structure modelling in a topology string.
Figure 3. The algorithm for secondary structure matching using topology strings.
Figure 4. An example for matching topology string of two reference proteins with 24 permuted topology strings of query protein.
Figure 5. Average TMscore obtained at different gap penalties.
Figure 6. Average precision-recall for searching 108 query proteins.
Figure 7. Retrieval effectiveness on different structural categories.
Figure 8. Retrieval effectiveness on low sequence identity.
