Immunoinformatics (Amst). 2024 Mar;13.
doi: 10.1016/j.immuno.2024.100033.

A comparison of clustering models for inference of T cell receptor antigen specificity

Dan Hudson et al. Immunoinformatics (Amst). 2024 Mar.

Abstract

The vast potential sequence diversity of TCRs and their ligands has presented an historic barrier to computational prediction of TCR epitope specificity, a holy grail of quantitative immunology. One common approach is to cluster sequences together, on the assumption that similar receptors bind similar epitopes. Here, we provide the first independent evaluation of widely used clustering algorithms for TCR specificity inference, observing some variability in predictive performance between models, and marked differences in scalability. Despite these differences, we find that different algorithms produce clusters with high degrees of similarity for receptors recognising the same epitope. Our analysis strengthens the case for use of clustering models to identify signals of common specificity from large repertoires, whilst highlighting scope for improvement of complex models over simple comparators.

Keywords: Clustering models; Deorphanizing TCRs; T cell antigen specificity; T cell receptor repertoire analysis.
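The clustering assumption in the abstract (similar receptors bind similar epitopes) can be illustrated with a very simple comparator. The sketch below is not one of the models evaluated in the paper. It assumes a hypothetical rule that links CDR3 sequences of equal length differing at no more than one position, then takes connected components as clusters. The function names and the toy sequences are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_cdr3(seqs, max_dist=1):
    """Single-linkage clustering of CDR3 strings (illustrative rule only).

    Two sequences are linked when they have the same length and differ at
    no more than `max_dist` positions; clusters are connected components.
    """
    unique = sorted(set(seqs))
    parent = {s: s for s in unique}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Only sequences of equal length are compared.
    by_len = defaultdict(list)
    for s in unique:
        by_len[len(s)].append(s)

    for group in by_len.values():
        for a, b in combinations(group, 2):
            if hamming(a, b) <= max_dist:
                parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for s in unique:
        clusters[find(s)].add(s)
    # Largest clusters first, ties broken alphabetically for determinism.
    return sorted(clusters.values(), key=lambda c: (-len(c), sorted(c)))


# Toy CDR3-like strings, invented for this example.
toy = [
    "CASSLGQAYEQYF",
    "CASSLGQGYEQYF",
    "CASSLGQSYEQYF",
    "CASSIRSSYEQYF",
    "CASSPDRGNTEAFF",
]
clusters = cluster_cdr3(toy)
```

The pairwise comparison is quadratic in the number of sequences of each length, which is the kind of cost that matters when clustering large repertoires.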

Conflict of interest statement

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: D.H. provides consultancy services to companies active in T cell antigen discovery and vaccine development. The other authors declare no competing interests.

Figures

Graphical abstract
Fig. 1
Supervised and unsupervised learning in T cell epitope specificity inference. (A) SPMs (left) fit a predictive function f(x) to training data having an independent variable X (TCR sequences and other features) and dependent variable y (epitopes or pMHC complexes). This function may then be applied to predict the cognate epitopes of orphan TCRs. UCMs (right) generate a mapping from TCR sequences to a cluster allocation, such that each TCR is assigned to one or more clusters having common epitope specificity. (B) Application of UCMs to de-orphanise TCRs by co-clustering.
Fig. 2
Comparison of model performance across datasets. Weighted F1-scores shown for V10-V1000 grouped: (A) α and β chain selections combined; and (B) split by chain selection. Significance values: ****, p < 0.0001; ***, p < 0.001; **, p < 0.01; *, p < 0.05; n.s., not significant.
Fig. 3
CDR3β sequence logos for the largest clusters produced per epitope, dataset V1000. Logos were produced with WebLogo for TCRs in the largest cluster produced for a given epitope per model, following sequence alignment with MUSCLE.
Fig. 4
Investigating model scalability, comparing model runtimes as a function of the number of synthetic TCR sequences introduced with OLGA. All experiments conducted on dataset V1000 (β chain selection).
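The co-clustering idea in Fig. 1B and the weighted F1-score reported in Fig. 2 can be sketched together. The code below is an assumption about how such a scoring could be set up, not the authors' protocol: each TCR in a cluster is assigned the most common known epitope in that cluster, and the weighted F1 is the per-epitope F1 averaged with weights equal to each epitope's number of true instances. All names and labels are hypothetical.

```python
from collections import Counter


def majority_label(labels):
    """Most common known label in a cluster, or None if all are orphans."""
    known = [lab for lab in labels if lab is not None]
    return Counter(known).most_common(1)[0][0] if known else None


def deorphanise(clusters):
    """Assign each TCR the majority known epitope of its cluster.

    `clusters` is a list of clusters, each a list of (tcr, epitope) pairs,
    where epitope is None for an orphan TCR.
    """
    assigned = {}
    for members in clusters:
        label = majority_label([epitope for _, epitope in members])
        for tcr, _ in members:
            assigned[tcr] = label
    return assigned


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Toy example with invented identifiers and epitope labels "A" and "B".
toy_clusters = [
    [("tcr1", "A"), ("tcr2", "A"), ("tcr3", None)],
    [("tcr4", "B"), ("tcr5", None)],
]
assigned = deorphanise(toy_clusters)

score = weighted_f1(["A", "A", "A", "B"], ["A", "A", "B", "B"])
```

In the toy example the orphans tcr3 and tcr5 inherit the labels "A" and "B" from their clusters.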
