Front Immunol. 2021 Mar 11:12:640725.
doi: 10.3389/fimmu.2021.640725. eCollection 2021.

TCRMatch: Predicting T-Cell Receptor Specificity Based on Sequence Similarity to Previously Characterized Receptors

William D Chronister et al. Front Immunol.

Abstract

The adaptive immune system in vertebrates has evolved to recognize non-self antigens, such as proteins expressed by infectious agents and mutated cancer cells. T cells play an important role in antigen recognition by expressing a diverse repertoire of antigen-specific receptors, which bind epitopes to mount targeted immune responses. Recent advances in high-throughput sequencing have enabled the routine generation of T-cell receptor (TCR) repertoire data. Identifying the specific epitopes targeted by different TCRs in these data would be valuable. To accomplish that, we took advantage of the ever-increasing number of TCRs with known epitope specificity curated in the Immune Epitope Database (IEDB) since 2004. We compared seven metrics of sequence similarity to determine their power to predict if two TCRs have the same epitope specificity. We found that a comprehensive k-mer matching approach produced the best results, which we have implemented into TCRMatch, an openly accessible tool (http://tools.iedb.org/tcrmatch/) that takes TCR β-chain CDR3 sequences as an input, identifies TCRs with a match in the IEDB, and reports the specificity of each match. We anticipate that this tool will provide new insights into T cell responses captured in receptor repertoire and single cell sequencing experiments and will facilitate the development of new strategies for monitoring and treatment of infectious, allergic, and autoimmune diseases, as well as cancer.

Keywords: IEDB; T cell; epitope; epitope prediction tool; immune repertoire analysis; sequence similarity.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Example precision and recall calculation. Using a TCRMatch score cutoff of 0.9, the input sequence (A) matches 3 out of 4 tested sequences. Of the 5 epitopes recognized by the 3 matches, 2 epitopes (E1, E1) are shared with A (true positives, TP) and 3 epitopes (E4, E5, E4) are not shared with A (false positives, FP), resulting in a precision of 2/5. Of the input sequence's three epitopes (E1, E2, E3), E1 is also recognized by the first and third matches, while E2 and E3 are not recognized by any of the matches; therefore, recall = 1/3.
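The arithmetic in the Figure 1 example can be reproduced with a short Python sketch. The assignment of epitopes to individual matches below is an assumption chosen to be consistent with the caption (the caption only states that E1 is recognized by the first and third matches), and the function name is hypothetical, not part of TCRMatch.

```python
from fractions import Fraction

def precision_recall(input_epitopes, matches):
    """Precision over all epitope occurrences among the matches;
    recall over the distinct epitopes of the input sequence."""
    input_set = set(input_epitopes)
    # Flatten the epitopes of all matches, keeping repeats (E1 and E4 occur twice).
    match_epitopes = [e for match in matches for e in match]
    tp = sum(1 for e in match_epitopes if e in input_set)
    fp = len(match_epitopes) - tp
    precision = Fraction(tp, tp + fp) if match_epitopes else None
    # An input epitope counts as recalled if any match recognizes it.
    recalled = input_set & set(match_epitopes)
    recall = Fraction(len(recalled), len(input_set))
    return precision, recall

# Assumed layout: three matches above the 0.9 cutoff, five epitopes in total.
input_epitopes = ["E1", "E2", "E3"]
matches = [["E1", "E4"], ["E5"], ["E1", "E4"]]

precision, recall = precision_recall(input_epitopes, matches)
print(precision, recall)
```

Note that precision counts each epitope occurrence among the matches (so E1 contributes two true positives), while recall counts each of the input sequence's epitopes once.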
Figure 2
Precision-recall and ROC plots comparing different sequence similarity metrics. (A) Each similarity metric was evaluated at different thresholds for its ability to recall TCR sequences in the IEDB that recognized the same epitope (x-axis). This recall was compared with the precision at the same threshold (y-axis), which specifies the percentage of match epitopes that were also recognized by the input sequences. The gray dashed line indicates average performance on the randomized IEDB dataset, in which CDR3β-epitope pairs were shuffled. (B) All similarity metrics were evaluated for their performance as measured by true positive rate (TPR, y-axis) and false positive rate (FPR, x-axis). The axis ranges of 0–0.005 show the differences in performance among similarity metrics at data points where recall < 0.5, as determined from the analysis shown in (A). The dashed line indicates a random baseline for which TPR = FPR.
Figure 3
Precision-recall plots comparing performance of sequence similarity metrics on 10x dataset. Precision and recall were calculated for all seven metrics across similarity thresholds in the analysis comparing the 10x dataset against IEDB.
Figure 4
Precision-recall plots comparing performance of sequence similarity metrics on paired CDR3α-CDR3β data. Precision and recall were calculated for all seven metrics across similarity thresholds on three related datasets: (A) CDR3α-CDR3β pairs, (B) CDR3α sequences only, and (C) CDR3β sequences only.
Figure 5
Flowchart of TCRMatch. The user provides one or more CDR3β sequences and selects a similarity cutoff. If N-terminal cysteine (C) and C-terminal phenylalanine (F) or tryptophan (W) are present, these residues are removed prior to the similarity search against the IEDB CDR3β sequences. The chosen similarity cutoff is used to filter TCRMatch's final results, which consist of matching sequences and corresponding epitopes from the IEDB.
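The trimming and filtering steps in the Figure 5 flowchart can be sketched in Python as follows. This is an illustration only, not the TCRMatch implementation: the function names are hypothetical, the example sequences and scores are made up, trimming is assumed to apply only when both flanking residues are present, and the cutoff is assumed to be inclusive.

```python
def trim_cdr3(seq):
    """Remove the N-terminal C and C-terminal F or W from a CDR3beta sequence.

    Assumption: both flanking residues must be present for trimming to apply.
    """
    seq = seq.strip().upper()
    if len(seq) > 2 and seq[0] == "C" and seq[-1] in ("F", "W"):
        return seq[1:-1]
    return seq

def keep_matches(scored_matches, cutoff):
    """Keep (sequence, score) pairs whose score meets the chosen cutoff.

    Assumption: a score equal to the cutoff is retained.
    """
    return [(s, score) for s, score in scored_matches if score >= cutoff]

# Illustrative input; the sequence is an example, not taken from the article.
trimmed = trim_cdr3("CASSLGQAYEQYF")
print(trimmed)

# Hypothetical similarity scores against database sequences.
scored = [("ASSLGQAYEQY", 1.0), ("ASSLGQGYEQY", 0.93), ("ASSPTGGYEQY", 0.71)]
print(keep_matches(scored, cutoff=0.9))
```

Trimming before the search means that input sequences given with or without the conserved flanking residues are compared on the same footing.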
