Investigation and identification of protein carbonylation sites based on position-specific amino acid composition and physicochemical features
- PMID: 28361707
- PMCID: PMC5374553
- DOI: 10.1186/s12859-017-1472-8
Abstract
Background: Protein carbonylation, an irreversible and non-enzymatic post-translational modification (PTM), is often used as a marker of oxidative stress. When reactive oxygen species (ROS) oxidize amino acid side chains, carbonyl (CO) groups are produced, especially on lysine (K), arginine (R), threonine (T), and proline (P). Because little is known about the substrate specificity of carbonylation, we developed a systematic method for a comprehensive investigation of protein carbonylation sites.
Results: After removing redundant data collected from multiple carbonylation-related articles, a total of 226 human carbonylated proteins were used as the training dataset, which comprised 307, 126, 128, and 129 carbonylation sites on K, R, T, and P residues, respectively. To identify features useful for predicting carbonylation sites, the linear amino acid sequence was used both to build a predictive model from the training dataset and as a baseline for comparing predictive performance against other feature types, including amino acid composition (AAC), amino acid pair composition (AAPC), position-specific scoring matrix (PSSM), positional weighted matrix (PWM), solvent-accessible surface area (ASA), and physicochemical properties. The investigation of position-specific amino acid composition revealed that positively charged amino acids (K and R) are markedly enriched around carbonylated sites, which may help discriminate carbonylation sites from non-carbonylation sites. Predictive models were built using various features and three different machine learning methods. In five-fold cross-validation, models trained with the PWM feature gave better sensitivity on the positive training dataset, while models trained with the AAindex feature achieved higher specificity on the negative training dataset. In addition, the model trained with hybrid features (PWM, AAC, and AAindex) obtained the best MCC values: 0.432, 0.472, 0.443, and 0.467 on K, R, T, and P residues, respectively.
Conclusion: Compared with an existing prediction tool, the selected models trained with hybrid features provided promising accuracy on an independent testing dataset. In short, this work not only characterized the carbonylated substrate preference, but also demonstrated that the proposed method offers a feasible means of accelerating the preliminary discovery of protein carbonylation sites.
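The sequence-based features named in the Results can be illustrated with a minimal Python sketch. It assumes a symmetric window around each candidate residue; the flank size, the padding character, and all function names are hypothetical choices for illustration and are not taken from the paper. The per-position frequency matrix shown here is only a simplified stand-in for the PWM feature, which the abstract does not define in detail.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def extract_windows(sequence, residue, flank=10, pad="-"):
    """Return (1-based position, window) for every occurrence of `residue`.

    The flank size and the padding character are assumptions for this
    sketch; the abstract does not state the window length that was used.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, aa in enumerate(sequence):
        if aa == residue:
            # The candidate residue sits at the centre of the window.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def aac(window):
    """Amino acid composition: 20 relative frequencies, padding ignored."""
    counts = Counter(c for c in window if c in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def position_frequency_matrix(windows):
    """Frequency of each amino acid at each window position.

    All windows must have the same length. Padding characters are
    excluded from the per-position totals.
    """
    length = len(windows[0])
    matrix = []
    for pos in range(length):
        column = Counter(w[pos] for w in windows if w[pos] in AMINO_ACIDS)
        total = sum(column.values())
        matrix.append(
            {a: (column[a] / total if total else 0.0) for a in AMINO_ACIDS}
        )
    return matrix


if __name__ == "__main__":
    # Toy sequence, not real data.
    for position, window in extract_windows("MKRAKT", "K", flank=2):
        print(position, window, aac(window))
```

Building one such matrix from windows around carbonylated sites and another from non-carbonylated sites is one way to look for position-specific enrichment of the kind reported for K and R.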
Keywords: Amino acid composition; Physicochemical properties; Protein carbonylation; Reactive Oxygen Species (ROS).
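The abstract reports sensitivity, specificity, and the Matthews correlation coefficient (MCC). As a hedged illustration, these standard metrics can be computed from binary labels as sketched below; the function names and the toy labels are hypothetical and do not reproduce the paper's data or evaluation code.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = carbonylated, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Fraction of true sites that were predicted as sites."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Fraction of non-sites that were predicted as non-sites."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy labels, not real data.
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print(sensitivity(tp, fn), specificity(tn, fp), mcc(tp, tn, fp, fn))
```

MCC ranges from -1 to 1 and accounts for all four cells of the confusion matrix, which makes it less sensitive than plain accuracy to an imbalance between positive and negative sites.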
Similar articles
- MDD-carb: a combinatorial model for the identification of protein carbonylation sites with substrate motifs. BMC Syst Biol. 2017 Dec 21;11(Suppl 7):137. doi: 10.1186/s12918-017-0511-4. PMID: 29322938 Free PMC article.
- SOHSite: incorporating evolutionary information and physicochemical properties to identify protein S-sulfenylation sites. BMC Genomics. 2016 Jan 11;17 Suppl 1(Suppl 1):9. doi: 10.1186/s12864-015-2299-1. PMID: 26819243 Free PMC article.
- iDPGK: characterization and identification of lysine phosphoglycerylation sites based on sequence-based features. BMC Bioinformatics. 2020 Dec 9;21(1):568. doi: 10.1186/s12859-020-03916-5. PMID: 33297954 Free PMC article.
- [Protein carbonylation and its role in physiological processes in plants]. Postepy Biochem. 2012;58(1):34-43. PMID: 23214127 Review. Polish.
- Carbonylation of proteins-an element of plant ageing. Planta. 2020 Jul 1;252(1):12. doi: 10.1007/s00425-020-03414-1. PMID: 32613330 Free PMC article. Review.
Cited by
- Skin senescence-from basic research to clinical practice. Front Med (Lausanne). 2024 Oct 18;11:1484345. doi: 10.3389/fmed.2024.1484345. eCollection 2024. PMID: 39493718 Free PMC article. Review.
- dbAMP: an integrated resource for exploring antimicrobial peptides with functional activities and physicochemical properties on transcriptome and proteome data. Nucleic Acids Res. 2019 Jan 8;47(D1):D285-D297. doi: 10.1093/nar/gky1030. PMID: 30380085 Free PMC article.
- Acute total body ionizing gamma radiation induces long-term adverse effects and immediate changes in cardiac protein oxidative carbonylation in the rat. PLoS One. 2020 Jun 4;15(6):e0233967. doi: 10.1371/journal.pone.0233967. eCollection 2020. PMID: 32497067 Free PMC article.
- Protein Carbonylation: Emerging Roles in Plant Redox Biology and Future Prospects. Plants (Basel). 2021 Jul 15;10(7):1451. doi: 10.3390/plants10071451. PMID: 34371653 Free PMC article. Review.
- MDD-carb: a combinatorial model for the identification of protein carbonylation sites with substrate motifs. BMC Syst Biol. 2017 Dec 21;11(Suppl 7):137. doi: 10.1186/s12918-017-0511-4. PMID: 29322938 Free PMC article.