2008 Aug 31:9:358.
doi: 10.1186/1471-2105-9-358.

TOPS++FATCAT: fast flexible structural alignment using constraints derived from TOPS+ Strings Model

Mallika Veeramalai et al. BMC Bioinformatics.

Abstract

Background: Protein structure analysis and comparison are major challenges in structural bioinformatics. Despite the existence of many tools and algorithms, very few of them have managed to capture the intuitive understanding of protein structures developed in structural biology, especially in the context of rapid database searches. Such intuitions could help speed up similarity searches and make it easier to understand the results of such analyses.

Results: We developed the TOPS++FATCAT algorithm, which uses an intuitive description of protein structures, as captured in the popular TOPS diagrams, to limit the search space of aligned fragment pairs (AFPs) in the flexible alignment of protein structures performed by the FATCAT algorithm. TOPS++FATCAT is faster than FATCAT by more than an order of magnitude, with a minimal cost in classification and alignment accuracy. For beta-rich proteins its accuracy is better than that of FATCAT, because the TOPS+ strings model contains important information about the parallel and anti-parallel hydrogen-bond patterns between the beta-strand secondary structure elements (SSEs). We show that the TOPS++FATCAT errors, rare as they are, can be clearly linked to oversimplifications in the TOPS diagrams and can be corrected by developing more precise secondary structure element definitions.

Software availability: The benchmark analysis results and the compressed archive of the TOPS++FATCAT program for the Linux platform can be downloaded from the following web site: http://fatcat.burnham.org/TOPS/

Conclusion: TOPS++FATCAT provides FATCAT accuracy and insights into protein structural changes at a speed comparable to sequence alignments, opening up the possibility of interactive protein structure similarity searches.

Figures

Figure 1
Different representations of the protein structure flavodoxin-fold CheY: (a) ribbon diagram; (b) TOPS style topology diagram; (c) distance; (d) contact map.
Figure 2
Rigid versus flexible alignment of aligned fragment pairs (AFPs).
Figure 3
(a) TOPS+ graph model, (b) TOPS+ strings model, and (c) TOPS+ strings matches between Dihydropteridine reductase from rat (1dhr) and human (1hdr). All the conserved TOPS+ strings elements are shown with pink arrows. Dotted arrows indicate matched helices and strands, plain arrows indicate matched loops, and arrows with double lines indicate matched ligand-interacting loops.
Figure 4
Schematic illustration of FATCAT structural alignment by chaining AFPs in a constrained alignment region defined by the TOPS alignment output. (a) In FATCAT, two fragments form an AFP (shown as a line in the graph) according to the criteria (see text). (b) The alignment of secondary structure elements from the TOPS+ comparison is used to define the constrained area for AFP detection, in which each pair of aligned secondary structure elements defines an "eligible" block (shown as filled squares). These blocks may be disconnected, so they are joined by connecting blocks (shown as open squares). (c) A buffer area is added around the constrained area defined in (b) (shown as the area enclosed by dashed lines) to obtain the constrained alignment region for FATCAT alignment (shown as the area enclosed by dark lines). (d) Only the AFPs within the constrained alignment region are used in the dynamic programming algorithm for chaining.
Figure 5
ROC curve analysis based on P-values obtained with the flexible and rigid options of the FATCAT and TOPS++FATCAT methods, where rF-pvalue and fF-pvalue indicate the rigid and flexible FATCAT methods, respectively; similarly, rT2F-pvalue and fT2F-pvalue represent the rigid and flexible TOPS++FATCAT methods, respectively.
Figure 6
Graph showing the runtime and AFP analysis of the FATCAT (in green) and TOPS++FATCAT (in red) methods based on the flexible option, (a) runtime statistics, where the x-axis indicates the 1,901 SCOP domain pairs ordered by flexible_FATCAT runtime; (b) total number of AFP statistics, where the x-axis represents the 1,901 SCOP domain pairs ordered based on AFPs from the flexible_FATCAT method.
Figure 7
(a) Superposition of d2trxa_ (gray) and d1kte__ (orange) from flexible_FATCAT and d1kte__ (blue) from flexible_TOPS++FATCAT; (b) AFP chaining alignment from flexible_FATCAT; (c) AFP chaining alignment from flexible_TOPS++FATCAT.
Figure 8
(a) Superposition of d1eca__ (gray) and d1cpca_ (orange) from flexible_FATCAT and d1cpca_ (blue) from flexible_TOPS++FATCAT; (b) AFP chaining alignment from flexible_FATCAT; (c) AFP chaining alignment from flexible_TOPS++FATCAT; (d) structural alignment from flexible_TOPS++FATCAT; (e) structural alignment from flexible_FATCAT.
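
The constrained-region idea described in the Figure 4 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the (i, j, length) representation of an AFP, the rectangular shape of the connecting blocks, and the buffer width are all assumptions made for illustration. Each aligned SSE pair yields an eligible block, consecutive blocks are joined by a connecting block, every block is widened by a buffer, and only AFPs whose two end cells fall inside the resulting region are kept for chaining.

```python
from typing import List, Tuple

Segment = Tuple[int, int]          # inclusive residue range (start, end)
Block = Tuple[int, int, int, int]  # (i_start, i_end, j_start, j_end)
AFP = Tuple[int, int, int]         # hypothetical form: (i, j, length)


def eligible_blocks(sse_pairs: List[Tuple[Segment, Segment]]) -> List[Block]:
    """One block per aligned SSE pair (filled squares in Figure 4b)."""
    return sorted((a[0], a[1], b[0], b[1]) for a, b in sse_pairs)


def connecting_blocks(blocks: List[Block]) -> List[Block]:
    """Rectangles bridging consecutive eligible blocks (open squares)."""
    bridges = []
    for (_, i1, _, j1), (k0, _, l0, _) in zip(blocks, blocks[1:]):
        bridges.append((min(i1, k0), max(i1, k0), min(j1, l0), max(j1, l0)))
    return bridges


def in_region(i: int, j: int, region: List[Block], buffer: int) -> bool:
    """True if cell (i, j) lies in any block widened by the buffer."""
    return any(
        i0 - buffer <= i <= i1 + buffer and j0 - buffer <= j <= j1 + buffer
        for i0, i1, j0, j1 in region
    )


def filter_afps(
    afps: List[AFP],
    sse_pairs: List[Tuple[Segment, Segment]],
    buffer: int = 5,  # assumed value; the source does not state one
) -> List[AFP]:
    """Keep AFPs whose first and last cells both fall inside the region."""
    blocks = eligible_blocks(sse_pairs)
    region = blocks + connecting_blocks(blocks)
    return [
        (i, j, n)
        for i, j, n in afps
        if in_region(i, j, region, buffer)
        and in_region(i + n - 1, j + n - 1, region, buffer)
    ]


# Toy input: two aligned SSE pairs and four candidate AFPs.
sse_pairs = [((0, 9), (0, 9)), ((20, 29), (25, 34))]
afps = [(2, 2, 8), (12, 14, 8), (40, 2, 8), (22, 27, 8)]
kept = filter_afps(afps, sse_pairs, buffer=2)
```

In this toy input, the AFP starting at (40, 2) lies far from every block and is discarded, while the other three remain. Fewer AFPs reaching the dynamic programming step is the kind of search-space reduction the abstract credits for the speed-up.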
