State of the art and challenges in sequence based T-cell epitope prediction

Claus Lundegaard¹, Ilka Hoof, Ole Lund, Morten Nielsen

Affiliation

¹ The Technical University of Denmark - DTU, Dept. of Systems Biology, Center for Biological Sequence Analysis - CBS, Kemitorvet 208, DK-2800 Kgs. Lyngby, Denmark. lunde@cbs.dtu.dk.

PMID: 21067545
PMCID: PMC2981877
DOI: 10.1186/1745-7580-6-S2-S3

Claus Lundegaard et al. Immunome Res. 2010 Nov 3;6 Suppl 2(Suppl 2):S3.

Abstract

Sequence-based T-cell epitope prediction has improved immensely over the last decade. Early methods predicted peptide binding to major histocompatibility complex (MHC) molecules with moderate accuracy and limited allele coverage, and offered no good estimates of the other events in the antigen-processing pathway. Methods have since been developed that produce highly accurate binding predictions for many alleles and integrate both proteasomal cleavage and transport events. Moreover, so-called pan-specific methods have been developed, which allow prediction of peptide binding to MHC alleles characterized by limited or no peptide binding data. Most of the developed methods are publicly available and have proven very useful as a shortcut in epitope discovery. Here, we go through some of the history of sequence-based prediction of helper as well as cytotoxic T-cell epitopes, focusing on some of the most accurate methods and their basic background.

Figures

**Figure 1**
**Depiction of the supertype concept.** Example alleles, including alleles common in western European populations, were assigned to four supertypes using the scheme from Sidney et al. [63]. The amino acid preferences at each position in a nonamer peptide are shown for each allele using sequence logo plots taken from MHCMotifViewer [129]. Amino acids with a positive influence on binding are plotted on the positive y-axis, and amino acids with a negative influence on binding are plotted on the negative y-axis. The height of each amino acid is given by its relative contribution to the binding specificity.

**Figure 2**
**Description of the NetMHCpan approach.** A) The amino acids used for prediction are the residues of the MHC alpha chain found to be in contact with the peptide according to structural data (blue, left) and the full binding peptide (right). B) The identified MHC residues are labeled blue in the amino acid sequence of HLA-A*0201. C) The labeled residues from B are presented as a pseudo sequence (left) together with the peptide sequence (right). D) Pairs of peptide sequences (left) and pseudo sequences (second from left) are presented to the ANN together with the experimentally determined log-scaled affinity (far right). The displayed allele information is not an input to the ANN. During training, the weights are adjusted to minimize the error between the predicted output and the assigned affinity.
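
The input construction described in the caption can be sketched as follows. This is a minimal illustration, not the NetMHCpan implementation: the MHC sequence and contact positions are toy placeholders, the one-hot encoding is a swapped-in simplification, and the log transform `1 - log(IC50) / log(50000)` is assumed here because the caption only says the affinity is log scaled.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pseudo_sequence(mhc_sequence, contact_positions):
    """Keep only the MHC residues at the (assumed) peptide-contact positions."""
    return "".join(mhc_sequence[i] for i in contact_positions)


def one_hot(sequence):
    """Encode each residue as a 20-dimensional indicator vector (simplified encoding)."""
    vector = []
    for residue in sequence:
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


def log_scaled_affinity(ic50_nm, max_nm=50000.0):
    """Map an IC50 in nM to [0, 1]; the exact transform is an assumption."""
    score = 1.0 - math.log(ic50_nm) / math.log(max_nm)
    return min(1.0, max(0.0, score))


def training_example(peptide, mhc_sequence, contact_positions, ic50_nm):
    """Build one (input, target) pair: peptide plus pseudo sequence -> scaled affinity.

    The allele name is deliberately not part of the input.
    """
    pseudo = pseudo_sequence(mhc_sequence, contact_positions)
    features = one_hot(peptide) + one_hot(pseudo)
    return features, log_scaled_affinity(ic50_nm)


# Toy data: hypothetical MHC sequence and contact positions, for illustration only.
toy_mhc = "ACDEFGHIKLMNPQRSTVWY"
toy_contacts = [0, 4, 9]
features, target = training_example("SLYNTVATL", toy_mhc, toy_contacts, ic50_nm=500.0)
```

Because the MHC molecule enters the network only through its pseudo sequence, a model trained this way can in principle be queried for an allele without binding data, as long as its sequence is known.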

**Figure 3**
**Schematic overview of the NN-align algorithm.** The artificial neural network (initially assigned random weights) is used to predict the binding affinity for a given peptide (right panel). The peptide (shown in light blue) is partitioned into overlapping 9mers, and the binding affinity is predicted by encoding the 9mer binding core combined with information about the peptide flanking residues (PFR), the length of the PFR, and the peptide length, as described in the text. The binding affinity of the peptide is assigned from the highest-scoring sub-peptide (shown in red). Next, the ANN weight configuration is updated using back-propagation to minimize the squared error between the predicted and measured binding affinities. This cycle is repeated for all peptides in the training data set for a given number of iterations.
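
The core-selection step in the caption can be sketched as below. This is an illustration under stated assumptions, not the NN-align code: `toy_score` is a hypothetical stand-in for the ANN (it simply counts hydrophobic residues at four assumed anchor positions and ignores the flanks and peptide length), and the back-propagation update is omitted.

```python
HYDROPHOBIC = set("AVILMFWY")


def candidate_cores(peptide, core_len=9):
    """Partition a peptide into overlapping cores with their flanking residues (PFR)."""
    candidates = []
    for start in range(len(peptide) - core_len + 1):
        candidates.append({
            "start": start,
            "core": peptide[start:start + core_len],
            "left_pfr": peptide[:start],
            "right_pfr": peptide[start + core_len:],
        })
    return candidates


def toy_score(candidate, peptide_length):
    """Hypothetical scorer standing in for the ANN.

    A real model would also use the PFR, PFR length, and peptide length.
    """
    anchors = (0, 3, 5, 8)  # assumed anchor positions within the 9mer core
    core = candidate["core"]
    return sum(core[i] in HYDROPHOBIC for i in anchors) / len(anchors)


def predict_peptide(peptide, score_fn=toy_score, core_len=9):
    """Assign the peptide the score of its highest-scoring sub-peptide."""
    scored = [(score_fn(c, len(peptide)), c) for c in candidate_cores(peptide, core_len)]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best_score, best


def squared_error(predicted, measured):
    """Per-peptide error that the training step would minimize."""
    return (predicted - measured) ** 2


score, best = predict_peptide("KKFKKLKLKKYK")
```

In training, the selected core would be fed back through the network and the weights adjusted to reduce `squared_error`, so the choice of core and the weights are refined together over the iterations.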

References

1. Heeney JL. Zoonotic viral diseases and the frontier of early diagnosis, control and prevention. J Intern Med. 2006;260:399–408. doi: 10.1111/j.1365-2796.2006.01711.x.
2. White PJ, Norman RA, Trout RC, Gould EA, Hudson PJ. The emergence of rabbit haemorrhagic disease virus: will a non-pathogenic strain protect the UK? Philos Trans R Soc Lond B Biol Sci. 2001;356:1087–1095. doi: 10.1098/rstb.2001.0897.
3. Perlman S, Netland J. Coronaviruses post-SARS: update on replication and pathogenesis. Nat Rev Microbiol. 2009;7:439–450. doi: 10.1038/nrmicro2147.
4. de Wit E, Kawaoka Y, de Jong MD, Fouchier RAM. Pathogenicity of highly pathogenic avian influenza virus in mammals. Vaccine. 2008;26(Suppl 4):D54–D58. doi: 10.1016/j.vaccine.2008.07.072.
5. Lister P, Reynolds F, Parslow R, Chan A, Cooper M, Plunkett A, Riphagen S, Peters M. Swine-origin influenza virus H1N1, seasonal influenza virus, and critical illness in children. Lancet. 2009;374:605–607. doi: 10.1016/S0140-6736(09)61512-9.
