Prediction of protein secondary structure content using amino acid composition and evolutionary information

Soyoung Lee¹, Byung-Chul Lee, Dongsup Kim

PMID: 16345074
DOI: 10.1002/prot.20821

Proteins. 2006 Mar 1;62(4):1107-14. doi: 10.1002/prot.20821.

Affiliation

¹ Department of Biosystems, Korea Advanced Institute of Science and Technology, Daejeon, South Korea.


Abstract

Determining protein structure and inferring function from it is one of the main problems of computational structural biology, and the first step is often the study of protein secondary structure. There have been many attempts to predict protein secondary structure content. Earlier attempts assumed that secondary structure content can be predicted successfully from the amino acid composition of a protein. Recent methods achieved remarkable prediction accuracy by using expanded composition information; the overall average error of the most successful method is 3.4%. Here, we demonstrate that even with the simple amino acid composition alone, prediction accuracy improves significantly when evolutionary information is included. The idea is motivated by the observation that evolutionarily related proteins share similar structures. We calculate the homolog-averaged amino acid composition of a protein, which is easily obtained from the multiple sequence alignment produced by running PSI-BLAST, and use those 20 numbers as input to a multiple linear regression, an artificial neural network, and a support vector regression. The overall average error of the support vector regression method is 3.3%. It is remarkable that we obtain comparable accuracy without using expanded composition information such as pair-coupled amino acid composition. This work again demonstrates that amino acid composition is a fundamental characteristic of a protein. We anticipate that this idea can be applied to many areas of protein bioinformatics that use amino acid composition, such as subcellular localization prediction, enzyme subclass prediction, domain boundary prediction, signal sequence prediction, and prediction of unfolded segments in a protein sequence, to name a few.
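
The homolog-averaged composition described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the averaging is done, so this example assumes the simplest reading: compute the 20-component composition of each sequence in the alignment (ignoring gaps) and take the unweighted mean over sequences. The function names and the toy alignment are hypothetical, and the code does not reproduce the authors' implementation or their PSI-BLAST settings.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Return the 20 amino acid fractions of one sequence.

    Gap characters and non-standard symbols are ignored (an assumption).
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def homolog_averaged_composition(aligned_sequences):
    """Unweighted mean of the per-sequence compositions (assumed scheme)."""
    if not aligned_sequences:
        raise ValueError("alignment is empty")
    vectors = [composition(seq) for seq in aligned_sequences]
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(AMINO_ACIDS))]


# Hypothetical toy alignment: a query and two homologs, '-' marks a gap.
alignment = ["ACD-A", "ACDEA", "GCD-A"]
features = homolog_averaged_composition(alignment)
```

The resulting `features` list holds the 20 numbers that would serve as regression input; the regression models themselves (linear regression, neural network, support vector regression) are not sketched here.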

© 2005 Wiley-Liss, Inc.
