Bioinformatics. 2021 Dec 11;37(24):4865-4867.
doi: 10.1093/bioinformatics/btab446.

ClusTCR: a python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity

Sebastiaan Valkiers et al. Bioinformatics.

Abstract

Motivation: The T-cell receptor (TCR) determines the specificity of a T-cell towards an epitope. The rules for antigen recognition remain largely undetermined, and current methods for grouping TCRs according to their epitope specificity are limited in performance and scalability. Multiple methodologies have been developed, but none of them efficiently clusters large datasets exceeding 1 million sequences. To address this limitation, we developed ClusTCR, a rapid TCR clustering alternative that scales efficiently to millions of CDR3 amino acid sequences without requiring knowledge of their antigen specificity.

Results: Benchmarking showed that ClusTCR achieves accuracy similar to that of other TCR clustering methods, as measured by cluster retention, purity and consistency. ClusTCR offers a drastic improvement in clustering speed: through ultraefficient similarity searching and sequence hashing, it clusters millions of TCR sequences in just a few minutes.
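
The abstract attributes the speed gain to similarity searching and sequence hashing but does not describe the implementation. As a rough illustration of why hashing helps, and not a reproduction of ClusTCR's actual code, the sketch below finds all pairs of CDR3 sequences that differ by a single substitution without comparing every sequence to every other one. It assumes that "similar" means equal length and Hamming distance 1, and that clusters are connected components of the resulting graph; the function names `hamming1_pairs` and `cluster_cdr3` and the example sequences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hamming1_pairs(sequences):
    """Return pairs of equal-length sequences that differ at exactly one position.

    Each sequence is hashed once per position with that position masked out.
    Two distinct sequences land in the same bucket only if they agree
    everywhere except at the masked position.
    """
    unique = sorted(set(sequences))
    buckets = defaultdict(list)
    for seq in unique:
        for i in range(len(seq)):
            # The (prefix, suffix) key encodes both the masked index and the length.
            buckets[(seq[:i], seq[i + 1:])].append(seq)

    pairs = set()
    for members in buckets.values():
        for a, b in combinations(members, 2):
            pairs.add((a, b))
    return pairs


def cluster_cdr3(sequences):
    """Group sequences into connected components of the Hamming-distance-1 graph.

    Sequences without any neighbour are left unclustered (assumption).
    """
    adjacency = defaultdict(set)
    for a, b in hamming1_pairs(sequences):
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    # Made-up example sequences, for illustration only.
    example = [
        "CASSLGQYEQYF",
        "CASSLGQFEQYF",
        "CASSLGAYEQYF",
        "CASSPTGNTEAFF",
        "CSARDRTGNGYTF",
    ]
    for members in cluster_cdr3(example):
        print(members)
```

The bucketing step does work proportional to the total number of residues, plus the pairs formed inside each bucket, rather than work quadratic in the number of sequences. That is the general reason a hashing approach can scale to large repertoires.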

Availability and implementation: ClusTCR was written in Python 3. It is available as an Anaconda package (https://anaconda.org/svalkiers/clustcr) and on GitHub (https://github.com/svalkiers/clusTCR).

Supplementary information: Supplementary data are available at Bioinformatics online.
