BMC Bioinformatics. 2007 Apr 2;8:113.
doi: 10.1186/1471-2105-8-113.

Improved residue contact prediction using support vector machines and a large feature set

Jianlin Cheng et al. BMC Bioinformatics. 2007.

Abstract

Background: Predicting protein residue-residue contacts is an important 2D prediction task. It is useful for ab initio structure prediction and for understanding protein folding. In spite of steady progress over the past decade, contact prediction remains largely unsolved.

Results: Here we develop a new contact map predictor (SVMcon) that uses support vector machines to predict medium- and long-range contacts. SVMcon integrates profiles, secondary structure, relative solvent accessibility, contact potentials, and other useful features. On the same test data set, SVMcon's accuracy is 4% higher than that of the latest version of the CMAPpro contact map predictor. SVMcon recently participated in the seventh edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7) experiment and was evaluated along with seven other contact map predictors. SVMcon was ranked as one of the top predictors, yielding the second-best coverage and accuracy for contacts with sequence separation ≥ 12 on 13 de novo domains.
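The accuracy and coverage figures quoted above can be illustrated with a minimal sketch. It assumes the usual definitions (accuracy is the fraction of predicted contacts that are true; coverage is the fraction of true contacts that are predicted), which the abstract does not spell out. The function name and the pair-list input format are hypothetical and are not taken from SVMcon.

```python
def accuracy_and_coverage(predicted, true_contacts, min_separation=12):
    """Score predicted residue pairs against true contacts.

    predicted, true_contacts: iterables of (i, j) residue index pairs.
    min_separation: only pairs with |i - j| >= min_separation are scored
    (12 mirrors the threshold quoted in the abstract).
    Assumed definitions: accuracy = hits / predicted, coverage = hits / true.
    """
    def normalize(pairs):
        # Treat (i, j) and (j, i) as the same contact and drop short-range pairs.
        return {
            (min(i, j), max(i, j))
            for i, j in pairs
            if abs(i - j) >= min_separation
        }

    pred = normalize(predicted)
    true = normalize(true_contacts)
    hits = len(pred & true)
    accuracy = hits / len(pred) if pred else 0.0
    coverage = hits / len(true) if true else 0.0
    return accuracy, coverage


# Toy data, invented for illustration only.
example_predicted = [(1, 20), (5, 30), (2, 8), (40, 10)]
example_true = [(20, 1), (10, 40), (3, 50), (2, 8)]
example_accuracy, example_coverage = accuracy_and_coverage(
    example_predicted, example_true
)
```

In the toy data the pair (2, 8) is ignored on both sides because its sequence separation is below 12, so only medium- and long-range pairs contribute to either score.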

Conclusion: We describe SVMcon, a new contact map predictor that uses SVMs and a large set of informative features. SVMcon yields good performance on medium- to long-range contact predictions and can be modularly incorporated into a structure prediction pipeline.

Figures

Figure 1
3D Structure of Protein 1DZOA. Protein 1DZOA is an α+β protein. It consists of two alpha helices and two beta sheets. Beta strands 1 and 2 form a parallel beta sheet. Beta strands 3, 4, 5, and 6 form an anti-parallel beta sheet. Most non-local contacts involve the pairing interactions between beta strands and the packing interactions between helices and beta sheets. (Figure rendered using Molscript [63]).
Figure 2
Predicted and True Contact Maps of 1DZOA. The upper triangle shows the true contacts of protein 1DZOA and the lower triangle shows the predicted contacts. The 2L (240) top-ranked contacts are selected. The key contacts within anti-parallel strand pairs (3,4), (4,5), and (5,6) are recalled, and a few contacts within the parallel strand pair (1,2) are also predicted correctly. However, very long-range contacts between alpha helices and beta sheets are not predicted, and there are some false positives. Most false positives lie close to true contacts, so they may not be very harmful when used as distance restraints to reconstruct the protein 3D structure.
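The selection of the 2L top-ranked contacts, and the observation that false positives sit near true contacts, can be sketched as follows. This is an illustrative sketch only: the score dictionary, the function names, the separation cutoff of 12, and the use of a map-coordinate (Chebyshev) distance are all assumptions, since the caption does not state how the pairs were filtered or how "close" was judged.

```python
def top_ranked_contacts(scores, seq_len, min_separation=12, factor=2):
    """Return the factor * seq_len highest-scoring residue pairs.

    scores: dict mapping (i, j) with i < j to a predicted contact score
            (hypothetical input format).
    min_separation: assumed cutoff; the caption does not give one.
    """
    candidates = [
        (score, pair)
        for pair, score in scores.items()
        if pair[1] - pair[0] >= min_separation
    ]
    # Highest score first; ties broken by residue indices for determinism.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [pair for _, pair in candidates[: factor * seq_len]]


def distance_to_nearest_true(pair, true_contacts):
    """Offset, in contact-map coordinates, to the closest true contact.

    Returns 0 for an exact hit. Requires a non-empty true_contacts list.
    """
    i, j = pair
    return min(max(abs(i - a), abs(j - b)) for a, b in true_contacts)


# Toy data, invented for illustration only.
toy_scores = {(0, 15): 0.9, (1, 20): 0.8, (2, 5): 0.99, (3, 30): 0.1, (4, 40): 0.5}
toy_selected = top_ranked_contacts(toy_scores, seq_len=1)
toy_offset = distance_to_nearest_true((1, 20), [(0, 15), (2, 21)])
```

A false positive with a small offset points at almost the same region of the map as a true contact, which is one way to read the caption's remark that such errors may do little harm when used as distance restraints.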

References

    1. Rost B, Liu J, Przybylski D, Nair R, Wrzeszczynski K, Bigelow H, Ofran Y. Prediction of protein structure through evolution. In: Gasteiger J, Engel T, editors. Handbook of Chemoinformatics – From Data to Knowledge. New York: Wiley; 2003. pp. 1789–1811.
    2. Olmea O, Rost B, Valencia A. Effective use of sequence correlation and conservation in fold recognition. J Mol Biol. 1999;295:1221–1239.
    3. Cheng J, Baldi P. A Machine Learning Information Retrieval Approach to Protein Fold Recognition. Bioinformatics. 2006;22:1456–1463.
    4. Bonneau R, Ruczinski I, Tsai J, Baker D. Contact order and ab initio protein structure prediction. Protein Sci. 2002;11:1937–1944.
    5. Aszodi A, Gradwell M, Taylor W. Global fold determination from a small number of distance restraints. J Mol Biol. 1995;251:308–326.
