BMC Bioinformatics. 2013 Oct 28:14:314.
doi: 10.1186/1471-2105-14-314.

Identification of properties important to protein aggregation using feature selection

Yaping Fang et al. BMC Bioinformatics.

Abstract

Background: Protein aggregation is a significant problem in the biopharmaceutical industry (protein drug stability) and is associated medically with over 40 human diseases. Although a number of computational models have been developed for predicting aggregation propensity and identifying aggregation-prone regions in proteins, little systematic research has been done to determine the physicochemical properties relevant to aggregation and their relative importance to this process. Such studies may not only enable accurate prediction of peptide aggregation propensities and identification of aggregation-prone regions in proteins, but also aid in discovering additional underlying mechanisms governing this process.

Results: We use two feature selection algorithms to identify 16 features, out of a total of 560 physicochemical properties, presumably important to protein aggregation. Two predictors (ProA-SVM and ProA-RF) using the selected features are built for predicting peptide aggregation propensity and identifying aggregation-prone regions in proteins. Both methods compare favourably with other state-of-the-art algorithms in cross-validation. The identified important properties are fairly consistent with previous studies and bring some new insights into protein and peptide aggregation. One interesting new finding is that aggregation-prone peptide sequences have similar properties to signal peptide and signal anchor sequences.

Conclusions: Both predictors are implemented in a freely available web application (http://www.abl.ku.edu/ProA/). We suggest that the quaternary structure of protein aggregates, especially soluble oligomers, may allow the formation of new molecular recognition signals that guide aggregate targeting to specific cellular sites.
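The feature selection step described in Results (reducing 560 physicochemical properties to 16) can be illustrated with a minimal sketch of recursive feature elimination around a linear SVM. This is not the authors' pipeline: it assumes scikit-learn is available, uses its generic `RFE` class as a stand-in for SVM-RFE, and runs on a synthetic dataset from `make_classification` rather than real peptide data. The sample count, step size, and random seed are arbitrary illustration choices.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 200 "peptides" described by
# 560 numeric features, of which 16 actually carry class signal.
X, y = make_classification(
    n_samples=200,
    n_features=560,
    n_informative=16,
    n_redundant=0,
    random_state=0,
)

# Recursive feature elimination: fit a linear SVM, drop the features
# with the smallest absolute weights, and repeat until 16 remain.
# step=10 removes up to 10 features per round (illustrative choice).
selector = RFE(
    estimator=SVC(kernel="linear"),
    n_features_to_select=16,
    step=10,
)
selector.fit(X, y)

# Column indices of the features that survived elimination.
selected = [i for i, keep in enumerate(selector.support_) if keep]
print(len(selected), "features kept:", selected)
```

In a real study the columns of `X` would be property values computed from each peptide sequence, and the number of features to keep would be chosen from an error-versus-feature-count curve rather than fixed in advance.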

Figures

Figure 1
Dependency of classification performance on the number of selected features. A) Classification error plotted against the number of features selected by SVM-RFE. B) OOB error plotted against the number of features selected by RF-IE. “Class error” equals 1 minus classification accuracy, and “OOB error” is the out-of-bag (OOB) error rate, which represents the classification error rate.
Figure 2
The receiver operating characteristic (ROC) curves for different methods. Area Under the ROC Curve (AUC): ProA-RF: 0.8929; ProA-SVM: 0.8680; ZYGGREGATOR: 0.8395; AAGRESCAN: 0.8336; FoldAmyloid: 0.7946; PAGE: 0.7303; TANGO: 0.7121.
Figure 3
The predicted aggregation regions of tau protein (region 244–368) by different methods. A. The predicted aggregation propensity profiles of ProA-SVM (dashed line) and ProA-RF (solid line). B. The predicted aggregation propensity profiles of ZYGGREGATOR (blue), AAGRESCAN (red), FoldAmyloid (black), PAGE (purple), and TANGO (green). “A” and “N” indicate experimentally confirmed aggregation and non-aggregation regions, respectively.
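The AUC values quoted in the Figure 2 caption summarize how well each method ranks aggregating peptides above non-aggregating ones. As a rough illustration of what that number measures, the sketch below computes AUC from the pairwise-ranking (Mann-Whitney) identity: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the paper.

```python
def roc_auc(labels, scores):
    """Return the area under the ROC curve for binary labels (1/0).

    Uses the pairwise identity: AUC is the probability that a randomly
    chosen positive example scores higher than a randomly chosen
    negative example, with ties counted as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = aggregating, 0 = non-aggregating.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the per-method values in the caption. The double loop is quadratic in the number of examples; a rank-based formulation would be preferable for large datasets.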
