BMC Bioinformatics. 2007 May 24;8 Suppl 5(Suppl 5):S3.
doi: 10.1186/1471-2105-8-S5-S3.

Learning biophysically-motivated parameters for alpha helix prediction

Blaise Gassend et al. BMC Bioinformatics. 2007.

Abstract

Background: Our goal is to develop a state-of-the-art protein secondary structure predictor with an intuitive, biophysically-motivated energy model. We treat structure prediction as an optimization problem, using parameterizable cost functions that represent biological "pseudo-energies". Machine learning methods are applied to estimate parameter values such that known protein structures are predicted correctly.
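To make the optimization view concrete, here is a minimal sketch of finding the lowest pseudo-energy helix/coil labeling by dynamic programming. It is an illustration under stated assumptions, not the paper's model: the two-state layout, the `residue_cost` table, and the single `switch_cost` penalty are all hypothetical stand-ins for the richer feature set the authors describe.

```python
from typing import Dict, Tuple

STATES = ("H", "C")  # helix, coil


def min_energy_labeling(
    sequence: str,
    residue_cost: Dict[Tuple[str, str], float],
    switch_cost: float,
) -> Tuple[str, float]:
    """Return the H/C labeling with the lowest total pseudo-energy.

    residue_cost[(residue, state)] is a hypothetical per-residue cost;
    missing entries count as 0. switch_cost is paid whenever two
    neighbouring residues are in different states.
    """
    if not sequence:
        return "", 0.0

    # best[s] = lowest cost of any labeling of the prefix that ends in state s
    best = {s: residue_cost.get((sequence[0], s), 0.0) for s in STATES}
    back = []  # one dict per later position: state -> best previous state

    for residue in sequence[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            # Choose the cheapest predecessor, including the switch penalty.
            prev = min(
                STATES,
                key=lambda p: best[p] + (switch_cost if p != s else 0.0),
            )
            step = switch_cost if prev != s else 0.0
            new_best[s] = best[prev] + step + residue_cost.get((residue, s), 0.0)
            pointers[s] = prev
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final state.
    state = min(STATES, key=lambda s: best[s])
    total = best[state]
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    return "".join(reversed(labels)), total
```

With made-up costs that favour helix for `A` and penalize it for `G`, calling `min_energy_labeling("AAAGG", {("A", "H"): -1.0, ("G", "H"): 1.0}, 0.5)` labels the first three residues as helix and the rest as coil. Learning then amounts to choosing the cost values so that such minimizers match known structures.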

Results: Focusing on the prediction of alpha helices in proteins, we show that a model with 302 parameters can achieve a Qα value of 77.6% and an SOVα value of 73.4%. These figures are among the best reported for techniques that do not rely on external databases (such as multiple sequence alignments). Further, it is easier to extract biological significance from a model with so few parameters.
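As a rough illustration of the per-residue measure, the sketch below computes a Qα-style score, assuming Qα is the percentage of residues whose helix/non-helix status is predicted correctly. The function name `q_alpha` and the one-letter `H` encoding are assumptions made for this example; the segment-based SOVα measure is more involved and is not reproduced here.

```python
def q_alpha(predicted: str, observed: str) -> float:
    """Percentage of residues whose helix / non-helix status matches.

    Both arguments are strings of per-residue labels in which 'H' marks
    a helix residue and any other character marks a non-helix residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    correct = sum((p == "H") == (o == "H") for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

For example, `q_alpha("HHHCC", "HHCCC")` scores four matching residues out of five.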

Conclusion: The method presented shows promise for the prediction of protein secondary structure. Biophysically-motivated elementary free energies can be learned using SVM techniques to construct an energy cost function whose predictive performance rivals the state of the art. This method is general and can be extended beyond the all-alpha case described here.

Figures

Figure 1
Predictor finite state machine. Double circles represent accept states. The arrow leading into state C3 indicates that it is an initial state. Each transition is labeled with the type of structure it corresponds to: helix (H) or coil (C), and a label (#i) indicating which features correspond to this transition in Table 4.
Figure 2
Qα accuracy histogram. Histogram showing the distribution of Qα across proteins in the test set. We show the average case and the best of the 20 runs, that is, the run with the highest Qα.
Figure 3
SOVα accuracy histogram. Histogram showing the distribution of SOVα across proteins in the test set. We show the average case and the best of the 20 runs, that is, the run with the highest SOVα.
Figure 4
Summary of learning algorithm. In this figure each large frame represents a problem that needs to be solved. On the left, we start with an intractably large problem. At each iteration, we pick a subset of the large problem to work on, solve it approximately using an SVM formulation, and use the resulting solution to expand the subset of constraints we are working with.
Figure 5
Algorithm for iterative constraint-based optimization.
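
The iterative scheme summarized in the captions for Figures 4 and 5 can be sketched as a constraint-generation loop. This is a toy illustration, not the authors' algorithm: the feature map counts (residue, label) pairs, the search for a violated constraint is brute force, and a simple subgradient update stands in for the SVM solve on the working set. All function names and parameters here are hypothetical.

```python
from collections import Counter
from itertools import product


def features(sequence, labels):
    """Count (residue, label) pairs; a toy stand-in for real energy features."""
    return Counter(zip(sequence, labels))


def energy(weights, sequence, labels):
    """Pseudo-energy of a labeling: weighted sum of its feature counts."""
    return sum(weights.get(k, 0.0) * n for k, n in features(sequence, labels).items())


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def violation(weights, sequence, truth, wrong):
    # Target: energy(truth) + hamming(wrong, truth) <= energy(wrong).
    return (energy(weights, sequence, truth) + hamming(wrong, truth)
            - energy(weights, sequence, wrong))


def most_violated(weights, sequence, truth):
    """Brute-force search for the labeling that most violates the margin."""
    candidates = ("".join(c) for c in product("HC", repeat=len(sequence)))
    return min(candidates,
               key=lambda y: energy(weights, sequence, y) - hamming(y, truth))


def learn(sequence, truth, step=0.5, max_rounds=50, max_inner=1000):
    weights, working_set = {}, []
    for _ in range(max_rounds):
        wrong = most_violated(weights, sequence, truth)
        if violation(weights, sequence, truth, wrong) <= 1e-9:
            break  # no constraint of the full problem is violated
        working_set.append(wrong)  # expand the subset of constraints
        # Approximate solve on the working set (stand-in for the SVM step).
        for _ in range(max_inner):
            changed = False
            for y in working_set:
                if violation(weights, sequence, truth, y) > 1e-9:
                    delta = features(sequence, truth)
                    delta.subtract(features(sequence, y))
                    for k, n in delta.items():
                        weights[k] = weights.get(k, 0.0) - step * n
                    changed = True
            if not changed:
                break
    return weights
```

Calling `learn("AAGG", "HHCC")` on this toy input yields weights under which the true labeling has the lowest energy among all sixteen candidates. A practical version would replace the brute-force search with dynamic programming over the predictor's state machine and the inner loop with a proper SVM solver.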
