PLoS Comput Biol. 2012;8(12):e1002829.
doi: 10.1371/journal.pcbi.1002829. Epub 2012 Dec 27.

Reliable B cell epitope predictions: impacts of method development and improved benchmarking

Jens Vindahl Kringelum et al. PLoS Comput Biol. 2012.

Abstract

The interaction between antibodies and antigens is one of the most important immune system mechanisms for clearing infectious organisms from the host. Antibodies bind to antigens at sites referred to as B-cell epitopes. Identifying the exact location of B-cell epitopes is essential in several biomedical applications, such as rational vaccine design and the development of disease diagnostics and immunotherapeutics. However, experimental mapping of epitopes is resource intensive, making in silico methods an appealing complementary approach. To date, the reported performance of methods for in silico mapping of B-cell epitopes has been moderate. Several issues regarding the evaluation data sets may, however, have led to the performance values being underestimated: rarely have all potential epitopes been mapped on an antigen, and antibodies are generally raised against the antigen in a given biological context, not against the antigen monomer. Handling these aspects improperly leads to many artificial false positive predictions and hence to incorrectly low performance values. To demonstrate the impact of proper benchmark definitions, we here present an updated version of the DiscoTope method incorporating a novel spatial neighborhood definition and half-sphere exposure as a surface measure. Compared to other state-of-the-art prediction methods, DiscoTope-2.0 displayed improved performance both in cross-validation and in independent evaluations. Using DiscoTope-2.0, we assessed the impact on performance of using proper benchmark definitions. For 13 proteins in the training data set where sufficient biological information was available to make a proper benchmark redefinition, the average AUC performance improved from 0.791 to 0.824. Similarly, the average AUC performance on an independent evaluation data set improved from 0.712 to 0.727.
Our results thus demonstrate that, given proper benchmark definitions, B-cell epitope prediction methods achieve highly significant predictive performance, suggesting that these tools are a powerful asset in rational epitope discovery. The updated version of DiscoTope is available at www.cbs.dtu.dk/services/DiscoTope-2.0.
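
The effect of benchmark redefinition on AUC can be illustrated with a small sketch. The residue numbers, scores, and epitope sets below are hypothetical toy values, not data from the paper, and the `auc` helper is an illustrative rank-based implementation rather than the authors' evaluation code. The example assumes that a residue counts as positive if it belongs to any known epitope, and that residues outside the biological context can optionally be left out of the evaluation.

```python
from typing import Dict, FrozenSet, Set

def auc(scores: Dict[int, float],
        positives: Set[int],
        excluded: FrozenSet[int] = frozenset()) -> float:
    """Rank-based AUC: chance that a random epitope residue outscores
    a random non-epitope residue (ties count as one half)."""
    pos = [s for r, s in scores.items() if r in positives and r not in excluded]
    neg = [s for r, s in scores.items() if r not in positives and r not in excluded]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative residue")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-residue prediction scores (residue number -> score).
scores = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6, 5: 0.3, 6: 0.2, 7: 0.1, 8: 0.05}

# Hypothetical epitopes: the original benchmark knows only epitope_a,
# while the redefined benchmark also includes epitope_b.
epitope_a = {1, 4}
epitope_b = {2, 3}

# Single-epitope benchmark: residues 2 and 3 score high but are labeled
# negative, so they count as artificial false positives.
auc_single = auc(scores, epitope_a)

# Redefined benchmark: the union of all known epitopes is the positive set.
auc_all = auc(scores, epitope_a | epitope_b)

print(f"single epitope: {auc_single:.3f}")  # 0.833
print(f"all epitopes:   {auc_all:.3f}")     # 1.000
```

In this toy case the predictions are unchanged between the two runs. Only the labels differ, which is the sense in which incomplete epitope mapping can make reported performance look lower than it is.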

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Cross-validated performance.
Performance of different methods for predicting B-cell epitopes, evaluated on the DiscoTope dataset. From left to right: the original DiscoTope method, the uncombined log-odds ratio scores as described in the text, the surface measures UHS, RSA, FS HSE and Ta (see text), and the DiscoTope-2.0 method as described in the text. Performance of the original DiscoTope method was obtained from .
Figure 2. Illustration of benchmark redefinition on lysozyme.
Six unique discontinuous epitopes have been identified for lysozyme. When this comprehensive information on multiple epitopes is included, the reported performance increases. Predictions are illustrated as a heatmap on the protein surface, where red = high prediction score and blue = low prediction score.
Figure 3. Effect of benchmark redefinition and inclusion of biological units on prediction accuracy for the subset of 13 affected homology groups (see text).
Refer to Table S1 for complete definition of protein names.
Figure 4. Enhanced prediction accuracy by inclusion of structural data of the biological unit.
Illustration of prediction for the KvAP potassium channel. Left: using only one antigen chain; middle: using the biological tetramer; right: excluding membrane and cytoplasmic residues. Predictions are illustrated as a heatmap on the protein surface, where red = high prediction score and blue = low prediction score. Note that the stated performances are for the PDB entry 1K4C and not the complete potassium homology group.
Figure 5. Predictions for Gp120 plotted on the protein structure including bound antibody.
Each residue in the structure is colored from blue to red according to its DiscoTope-2.0 score. Blue indicates low scores (predicted non-epitope residues) and red indicates high scores (predicted epitope residues). Yellow indicates possible glycosylation sites retrieved from UniProt accession number P04578 (www.uniprot.org). a) Gp120 in surface representation and antibody in cartoon representation. b) Gp120 and antibody in cartoon representation. Note that the red alpha-1 helix, which is normally buried in the inner domain of Gp120 involved in Gp41:Gp120 complex formation, is exposed in the crystal structure.
