Front Immunol. 2018 Jul 27;9:1695.
doi: 10.3389/fimmu.2018.01695. eCollection 2018.

iBCE-EL: A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction

Balachandran Manavalan et al. Front Immunol.

Abstract

Identification of B-cell epitopes (BCEs) is a fundamental step for epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Given the avalanche of protein sequence data generated in the postgenomic age, an automated computational method is essential for fast and accurate identification of novel BCEs among the vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. Thus, developing a reliable model with significant prediction improvements is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed a novel ensemble learning framework for improved linear BCE prediction, called iBCE-EL, a fusion of two independent predictors, namely extremely randomized tree (ERT) and gradient boosting (GB) classifiers, which respectively use a combination of physicochemical properties (PCP) and amino acid composition, and a combination of dipeptide composition and PCP, as input features. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than the individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated the performance of iBCE-EL on an independent data set. The results show that iBCE-EL significantly outperformed the state-of-the-art method, with an MCC of 0.463. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL was implemented as a web-based platform, available at http://thegleelab.org/iBCE-EL. iBCE-EL offers two prediction modes: the first classifies peptide sequences as BCEs or non-BCEs, while the second lets users mine potential BCEs from protein sequences.
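The abstract reports performance as a Matthews correlation coefficient (MCC). As a reminder of what that number measures, the following is a minimal sketch of the standard MCC definition computed from confusion-matrix counts. The function name `mcc` and the convention of returning 0.0 when the denominator vanishes are assumptions made for illustration; they are not taken from the iBCE-EL implementation.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives, and false negatives. The result lies in [-1, 1].
    Assumption: return 0.0 when a row or column of the matrix is empty,
    which would otherwise cause a division by zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Because the MCC uses all four cells of the confusion matrix, it remains informative when the classes are unequal in size, as with the 5,550 BCEs and 6,893 non-BCEs described above.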

Keywords: B-cell epitope; ensemble learning; extremely randomized tree; gradient boosting; immunotherapy.

Figures

Figure 1
Overall framework of the proposed predictor. iBCE-EL development involved the following steps: (1) dataset curation, (2) feature extraction, (3) exploration of six different ML algorithms and selection of an appropriate algorithm and the corresponding features, and (4) construction of ensemble model.
Figure 2
Compositional and positional preference analysis. (A) and (B) respectively represent the amino acid and dipeptide preferences of BCEs and non-BCEs. (B) shows significant differences in the top 30 dipeptides. (C,D) represent positional conservation of 10 residues at the N- and C-terminals, respectively, between BCEs and non-BCEs, generated using two sample logos. In (A,B), the error bars show the standard error (SE), which indicates the reliability of the mean. A smaller SE indicates that the sample mean is a more accurate reflection of the actual population mean.
Figure 3
Performance of six different ML-based classifiers. Performance of various classifiers in distinguishing between B-cell epitopes (BCEs) and non-BCEs. A total of 27 classifiers were evaluated using 10 independent runs of 5-fold cross-validation, and their average performance in terms of AUC is shown. The final selected model for each ML-based method is shown with arrows. Abbreviations: AAC, amino acid composition; DPC, dipeptide composition; CTD, chain-transition-distribution; AAI, amino acid index; PCP, physicochemical properties; H1: AAC + AAI; H2: AAC + DPC + AAI; H3: AAC + DPC + AAI + CTD; H4: AAC + DPC + AAI + CTD + PCP; H5: AAC + DPC; H6: AAC + CTD; H7: AAC + PCP; H8: AAI + DPC; H9: AAI + DPC + CTD; H10: AAI + DPC + CTD + PCP; H11: AAI + CTD; H12: AAI + PCP; H13: DPC + CTD; H14: DPC + CTD + PCP; H15: DPC + PCP; H16: CTD + DPC; H17: AAC + AAI + PCP; N5: BPFN5; C5: BPFC5; N5C5: BPFN5 + BPFC5; N10: BPFN10; C10: BPFC10; and N10C10: BPFN10 + BPFC10.
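The AAC and DPC encodings named in the Figure 3 legend are standard sequence descriptors: the fraction of each of the 20 amino acids, and the fraction of each of the 400 ordered residue pairs. The sketch below shows one common way to compute them and to concatenate them into a hybrid vector such as H5 (AAC + DPC). It assumes peptides contain only the 20 standard one-letter residues; the function names and the residue ordering are illustrative choices, not taken from the iBCE-EL code.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 standard residues


def _check(seq, min_len):
    """Upper-case the sequence and reject unsupported input."""
    seq = seq.upper()
    if len(seq) < min_len:
        raise ValueError(f"sequence must have at least {min_len} residue(s)")
    if any(res not in AMINO_ACIDS for res in seq):
        raise ValueError("sequence contains a non-standard residue")
    return seq


def aac(seq):
    """Amino acid composition: 20 fractions that sum to 1."""
    seq = _check(seq, 1)
    return [seq.count(res) / len(seq) for res in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: 400 fractions over overlapping residue pairs."""
    seq = _check(seq, 2)
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return [counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def hybrid_aac_dpc(seq):
    """Concatenate AAC and DPC into one 420-dimensional vector (H5 in the legend)."""
    return aac(seq) + dpc(seq)
```

For example, `aac("AAC")` assigns 2/3 to alanine and 1/3 to cysteine, and `dpc("AAC")` assigns 0.5 each to the dipeptides AA and AC.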
Figure 4
Optimization of probability value threshold. The x- and y-axes, respectively, represent the probability value threshold and Matthews correlation coefficient. The optimal value selected for each method is shown with a circle.
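The threshold optimization described in the Figure 4 legend can be pictured as a simple scan: each candidate cutoff turns predicted probabilities into class labels, and the cutoff with the highest MCC is retained. The following is a minimal sketch under that assumption. The function names, the `>=` decision rule, and the tie-breaking rule (first best threshold wins) are illustrative and not taken from the paper.

```python
import math


def _mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def best_threshold(y_true, y_prob, thresholds):
    """Return (threshold, mcc) maximizing MCC over the candidate thresholds.

    y_true holds 0/1 labels, y_prob the predicted probability of class 1.
    A sample is called positive when its probability is >= the threshold.
    """
    best_t, best_score = None, -math.inf
    for t in thresholds:
        tp = tn = fp = fn = 0
        for label, prob in zip(y_true, y_prob):
            predicted = 1 if prob >= t else 0
            if predicted == 1 and label == 1:
                tp += 1
            elif predicted == 1 and label == 0:
                fp += 1
            elif predicted == 0 and label == 1:
                fn += 1
            else:
                tn += 1
        score = _mcc(tp, tn, fp, fn)
        if score > best_score:  # strict comparison keeps the first best threshold
            best_t, best_score = t, score
    return best_t, best_score
```

A typical call would scan a regular grid, for example `best_threshold(labels, probs, [i / 100 for i in range(5, 100, 5)])`, where `labels` and `probs` are hypothetical cross-validation outputs.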
Figure 5
Receiver operating characteristic curves of the various prediction models. Results of 5-fold cross-validation on (A) a benchmarking data set and (B) an independent data set.
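The area under an ROC curve (the AUC used in Figure 3) equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition, counting ties as half. It is an illustration only: the function name is hypothetical, and the quadratic-time loop is chosen for clarity rather than speed.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    y_true holds 0/1 labels and y_score the predicted scores for class 1.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of BCEs from non-BCEs.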
