Front Genet. 2022 Apr 28:13:883766.
doi: 10.3389/fgene.2022.883766. eCollection 2022.

Ensemble-AHTPpred: A Robust Ensemble Machine Learning Model Integrated With a New Composite Feature for Identifying Antihypertensive Peptides

Supatcha Lertampaiporn et al. Front Genet.

Abstract

Hypertension, or elevated blood pressure, is a serious medical condition that significantly increases the risk of cardiovascular disease, heart disease, diabetes, stroke, kidney disease, and other health problems affecting people worldwide. Thus, hypertension is one of the major global causes of premature death. Antihypertensive peptides (AHTPs) obtained from natural sources might be useful as nutraceuticals for preventing and treating hypertension with few or no side effects. Therefore, the search for alternative or novel AHTPs in food and other natural sources has received much attention, as AHTPs may be functional agents for human health. AHTPs have been observed in diverse organisms, although many of them remain underinvestigated. Identifying peptides with antihypertensive activity in the laboratory is time- and resource-consuming. Alternatively, computational methods based on robust machine learning can identify or screen potential AHTP candidates prior to experimental verification. In this paper, we propose Ensemble-AHTPpred, an ensemble machine learning algorithm composed of a random forest (RF), a support vector machine (SVM), and extreme gradient boosting (XGB), with the aim of integrating diverse heterogeneous algorithms to enhance the robustness of the final predictive model. The selected feature set includes computed features such as physicochemical properties, amino acid compositions (AACs), transitions, n-grams, and secondary structure-related information; these features provide more information for analyzing and explaining the characteristics of a predicted peptide. In addition, the tool is integrated with a newly proposed composite feature (generated based on a logistic regression function) that combines various feature aspects to enable improved AHTP characterization. Our tool, Ensemble-AHTPpred, achieved an overall accuracy above 90% on independent test data. Additionally, the approach was applied to novel experimentally validated AHTPs, obtained from recent studies, that did not overlap with the training and test datasets, and the tool predicted these AHTPs accurately.
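The two ideas named in the abstract, a composite feature built from a logistic function and an ensemble of three classifiers, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the abstract does not state how the three models' outputs are combined (averaging of class probabilities, i.e. soft voting, is used below as one common choice), and the weights, bias, feature values, and probabilities are hypothetical rather than taken from the paper.

```python
import math


def composite_feature(features, weights, bias):
    """Map a weighted sum of feature values through the logistic function.

    The result lies in (0, 1). The weights and bias are hypothetical here;
    in practice they would come from a fitted logistic regression.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def soft_vote(probabilities, threshold=0.5):
    """Average per-model AHTP probabilities and threshold the mean.

    Soft voting is an assumed combination rule, not one confirmed by the
    abstract.
    """
    mean_p = sum(probabilities) / len(probabilities)
    return mean_p, mean_p >= threshold


# Hypothetical feature values and coefficients for one peptide.
com = composite_feature([0.2, 0.5, 0.1], weights=[1.5, -0.8, 2.0], bias=-0.1)

# Hypothetical AHTP probabilities from an RF, an SVM, and an XGB model.
mean_p, is_ahtp = soft_vote([0.91, 0.62, 0.78])
print(f"composite feature = {com:.3f}, mean probability = {mean_p:.2f}, AHTP = {is_ahtp}")
```

A composite value computed this way can be appended to the feature vector as one more input column, so that several feature aspects are condensed into a single bounded score.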

Keywords: ACE inhibitor; ACE inhibitory peptide; antihypertensive; classification; ensemble machine learning; prediction.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Workflow of the proposed approach.
FIGURE 2
Percent average composition of amino acid residues present in the positive and negative datasets.
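A percent average composition of the kind plotted in Figure 2 can be computed as sketched below. This is an assumed reading of the caption (per-peptide percentages averaged over a dataset), and the example peptides are toy sequences, not data from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac_percent(peptide):
    """Percent composition of each standard residue in one peptide."""
    counts = Counter(peptide.upper())
    n = sum(counts[aa] for aa in AMINO_ACIDS)  # ignore non-standard symbols
    if n == 0:
        raise ValueError(f"no standard residues in {peptide!r}")
    return {aa: 100.0 * counts[aa] / n for aa in AMINO_ACIDS}


def average_composition(peptides):
    """Average the per-peptide percent compositions over a dataset."""
    per_peptide = [aac_percent(p) for p in peptides]
    return {
        aa: sum(c[aa] for c in per_peptide) / len(per_peptide)
        for aa in AMINO_ACIDS
    }


# Toy sequences for illustration only.
avg = average_composition(["IPP", "VPP"])
print({aa: round(v, 1) for aa, v in avg.items() if v > 0})
```

Running the same function on the positive and the negative dataset gives the two profiles that such a figure compares residue by residue.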
FIGURE 3
N-terminal features of AHTP positive data and non-AHTP negative data: (A) Heatmap of log odds ratios, where a lighter color denotes overrepresented amino acid residues in AHTPs compared to non-AHTPs (positive log odds score) and a darker color denotes underrepresented amino acid residues in AHTPs compared to non-AHTPs (negative log odds score). (B) Sequence logos of positions one to five of the AHTP positive dataset. (C) Sequence logos of positions one to five of the non-AHTP negative dataset.
FIGURE 4
C-terminal analysis of AHTP positive data and non-AHTP negative data: (A) Heatmap of log odds ratios, where a lighter color denotes overrepresented amino acid residues in AHTPs compared to non-AHTPs (positive log odds score) and a darker color denotes underrepresented amino acid residues in AHTPs compared to non-AHTPs (negative log odds score). (B) Sequence logos of C-terminal positions one to five of the AHTP positive dataset. (C) Sequence logos of C-terminal positions one to five of the non-AHTP negative dataset.
FIGURE 5
Heatmap of the log odds scores of 2-mers abundant in the positive versus negative datasets. In the heatmap, a red color (high log odds score) denotes 2-mers overrepresented in AHTPs compared to non-AHTPs, and a white color (low log odds score) denotes 2-mers underrepresented in AHTPs compared to non-AHTPs.
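A log odds score for 2-mers, as visualized in Figure 5, can be sketched as the logarithm of the ratio between a 2-mer's frequency in the positive set and its frequency in the negative set. The pseudocount, the natural logarithm, and the toy peptides below are assumptions for illustration; the caption does not specify how the scores were smoothed or which log base was used.

```python
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIMERS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 2-mers


def dimer_frequencies(peptides, pseudocount=1.0):
    """Smoothed relative frequency of every 2-mer across a set of peptides.

    The pseudocount must be positive so that unseen 2-mers do not produce
    log(0) or a division by zero later on.
    """
    counts = Counter(p[i:i + 2] for p in peptides for i in range(len(p) - 1))
    total = sum(counts[d] for d in DIMERS) + pseudocount * len(DIMERS)
    return {d: (counts[d] + pseudocount) / total for d in DIMERS}


def dimer_log_odds(positive, negative, pseudocount=1.0):
    """Log odds per 2-mer: > 0 if enriched in positives, < 0 if depleted."""
    f_pos = dimer_frequencies(positive, pseudocount)
    f_neg = dimer_frequencies(negative, pseudocount)
    return {d: math.log(f_pos[d] / f_neg[d]) for d in DIMERS}


# Toy sequences for illustration only.
scores = dimer_log_odds(positive=["IPP", "VPP"], negative=["GAG", "AGA"])
top = sorted(scores, key=scores.get, reverse=True)[:3]
print(top)
```

Arranging the 400 scores in a 20 by 20 grid, first residue by second residue, yields a matrix that can be drawn as a heatmap.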
FIGURE 6
ROC curves of individual machine learning models.
FIGURE 7
Importance plots and SHAP plot: (A) Importance plots yielded by the RF (left: permutation importance; right: Gini importance). (B) Importance plot yielded by the XGB model. (C) SHAP summary plot of the top 15 features. (D) Dependence plot of composite feature comF2 for the AHTP class. (E) Density distribution of the SHAP plot’s top six features (sample) by class in the training data.
