Comparative Study
BMC Struct Biol. 2008 Mar 17:8:18.
doi: 10.1186/1472-6807-8-18.

A multi-template combination algorithm for protein comparative modeling


Jianlin Cheng. BMC Struct Biol. 2008.

Abstract

Background: Multiple protein templates are commonly used in manual protein structure prediction. However, few automated algorithms for selecting and combining multiple templates are available.

Results: Here we develop an effective multi-template combination algorithm for protein comparative modeling. The algorithm selects templates according to the similarity significance of the alignments between template and target proteins. It uses in full every template-target alignment whose similarity significance score falls within a threshold of the score of the top template-target alignment, whereas from a less similar template-target alignment it takes only the fragments that align with a sizable uncovered region of the target. We compare the algorithm with the traditional method of using a single top template on the 45 comparative modeling targets (i.e. easy template-based modeling targets) used in the seventh edition of Critical Assessment of Techniques for Protein Structure Prediction (CASP7). The multi-template combination algorithm improves the GDT-TS scores of predicted models by 6.8% on average. Statistical analysis shows that the improvement is significant (p-value < 10^-4). Compared with the ideal approach that always uses the best template, the multi-template approach yields only slightly better performance. During the CASP7 experiment, the preliminary implementation of the multi-template combination algorithm (FOLDpro) was ranked second among 67 servers in the category of high-accuracy structure prediction by the GDT-TS measure.
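
The selection rule described in the Results can be sketched in a few lines of Python. This is an illustrative reading of the abstract only, not the published implementation: the Alignment class, the select_templates function, the score_window and min_fragment parameters and their default values are all hypothetical, and the score is assumed to be a number where higher means more significant.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Alignment:
    """Hypothetical record for one template-target alignment."""
    template_id: str
    score: float        # assumed: higher means more significant similarity
    covered: Set[int]   # target residue positions aligned to the template


def contiguous_runs(positions: Set[int]) -> List[List[int]]:
    """Split a set of positions into runs of consecutive integers."""
    runs: List[List[int]] = []
    run: List[int] = []
    for p in sorted(positions):
        if run and p != run[-1] + 1:
            runs.append(run)
            run = []
        run.append(p)
    if run:
        runs.append(run)
    return runs


def select_templates(
    alignments: List[Alignment],
    score_window: float = 5.0,   # assumed threshold, not from the paper
    min_fragment: int = 10,      # assumed meaning of "sizable" region
) -> List[Tuple[str, List[int]]]:
    """Return (template_id, target positions used) in selection order."""
    ranked = sorted(alignments, key=lambda a: a.score, reverse=True)
    if not ranked:
        return []
    top_score = ranked[0].score
    covered: Set[int] = set()
    selected: List[Tuple[str, List[int]]] = []
    for aln in ranked:
        if top_score - aln.score <= score_window:
            # Close to the top alignment: keep the whole alignment.
            selected.append((aln.template_id, sorted(aln.covered)))
            covered |= aln.covered
        else:
            # Less similar: keep only fragments that fill a sizable gap.
            for run in contiguous_runs(aln.covered - covered):
                if len(run) >= min_fragment:
                    selected.append((aln.template_id, run))
                    covered |= set(run)
    return selected
```

Because the alignments are processed in descending score order, every whole alignment is added before any fragment is considered, so fragments only ever fill regions that the stronger templates leave uncovered.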

Conclusion: We have developed a novel multi-template algorithm to improve protein comparative modeling.


Figures

Figure 1
An automated multi-template comparative modeling pipeline.
Figure 2
GDT-TS scores of 45 comparative modeling targets (multi-template versus single-template). For 38 out of 45 targets, the multi-template approach yields higher GDT-TS scores than the single-template approach. Dots above the line represent targets where the multi-template method yields higher scores, dots on the line represent targets where the two methods yield the same score, and dots below the line represent targets where the single-template method yields higher scores.
Figure 3
GDT-TS scores of 23 high-accuracy targets (multi-template versus single-template). For 20 out of 23 domains (dots above the line), the multi-template approach yields higher GDT-TS scores than the single-template approach.
Figure 4
GDT-TS scores of the 27 comparative modeling targets (multi-template versus best-template). For 16 out of 27 targets (points above the line), the multi-template approach yields higher GDT-TS scores than the best-template approach.
Figure 5
A good example (CASP7 target T0315) of using multiple templates to improve model quality. (1) The superimposition of the experimental structure (PDB code: 2GZX) and the best template (PDB code: 1J6O). Blue and red lines represent the backbones of the experimental and template structures, respectively. One bad region is identified. (2) The superimposition of the experimental structure and a good template (PDB code: 1YIX). One bad region is identified. (3) The superimposition of the experimental structure and the model generated by 3Dpro during CASP7, based on multiple templates. The two bad regions in (1) and (2) are corrected in the model (3). Most other regions of the model are also closer to the experimental structure than the two templates.
Figure 6
Multi-Template Selection and Combination Algorithm.
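
The GDT-TS scores compared in Figures 2 to 4 are the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of target residues whose C-alpha atoms in the model lie within that cutoff of the experimental structure. The sketch below is a simplified illustration with a hypothetical function name: it assumes the per-residue distances come from one fixed superposition, whereas the official measure searches over many superpositions to maximize each percentage, so values computed this way may be lower than official ones.

```python
from typing import Sequence


def gdt_ts(distances: Sequence[float],
           cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0)) -> float:
    """Approximate GDT-TS (0 to 100) from per-residue C-alpha distances in Å.

    `distances` must hold one entry per target residue; use float("inf")
    for residues that are missing from the model.
    """
    if not distances:
        raise ValueError("distances must not be empty")
    n = len(distances)
    # Fraction of residues within each cutoff, then the mean over cutoffs.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)
```

For example, a five-residue target with distances of 0.5, 1.5, 3.0, 6.0 and 10.0 Å has 20%, 40%, 60% and 80% of its residues within the four cutoffs, giving a score of 50.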
