Brief Bioinform. 2022 Mar 10;23(2):bbab566.
doi: 10.1093/bib/bbab566.

TCRpower: quantifying the detection power of T-cell receptor sequencing with a novel computational pipeline calibrated by spike-in sequences

Shiva Dahal-Koirala et al. Brief Bioinform. 2022.

Abstract

T-cell receptor (TCR) sequencing has enabled the development of innovative diagnostic tests for cancers, autoimmune diseases and other applications. However, the rarity of many T-cell clonotypes presents a detection challenge, which may lead to misdiagnosis if diagnostically relevant TCRs remain undetected. To address this issue, we developed TCRpower, a novel computational pipeline for quantifying the statistical detection power of TCR sequencing methods. TCRpower calculates the probability of detecting a TCR sequence as a function of four key parameters: in vivo TCR frequency, T-cell sample count, read sequencing depth and read cutoff. To calibrate TCRpower, we selected unique TCRs of 45 T-cell clones (TCCs) as spike-in TCRs. We sequenced the spike-in TCRs from TCCs, together with TCRs from peripheral blood, using a 5' RACE protocol. The 45 spike-in TCRs covered a wide range of sample frequencies, from 5 per 100 to 1 per 1 million. The resulting spike-in TCR read counts and ground truth frequencies allowed us to calibrate TCRpower. In our TCR sequencing data, we observed a consistent linear relationship between sample and sequencing read frequencies. We were also able to reliably detect spike-in TCRs with frequencies as low as 1 per 1 million. By implementing an optimized read cutoff, we eliminated most of the falsely detected sequences in our data (99.0% for the TCR α-chain and 92.4% for the TCR β-chain), thereby improving diagnostic specificity. TCRpower is publicly available and can be used to optimize future TCR sequencing experiments, thereby enabling reliable detection of disease-relevant TCRs for diagnostic applications.
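How detection probability depends on these four parameters can be illustrated with a minimal Python sketch. This is not the TCRpower implementation. It assumes that the number of sampled target cells is Poisson distributed, that read counts follow a negative binomial distribution with a fixed size parameter, and that a TCR counts as detected when its read count exceeds the cutoff. The names `detection_probability`, `read_efficiency` and `size` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """P(K = k) for a Poisson distribution with mean lam."""
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def negbin_sf(cutoff, mean, size):
    """P(X > cutoff) for a negative binomial with the given mean and size."""
    if mean <= 0:
        return 0.0  # no target cells sampled, so no reads
    p = size / (size + mean)
    cdf = 0.0
    for x in range(cutoff + 1):
        log_pmf = (math.lgamma(x + size) - math.lgamma(size)
                   - math.lgamma(x + 1)
                   + size * math.log(p) + x * math.log1p(-p))
        cdf += math.exp(log_pmf)
    return max(0.0, 1.0 - cdf)


def detection_probability(freq, n_cells, n_reads, cutoff,
                          read_efficiency=1.0, size=1.0):
    """Probability that a TCR with in vivo frequency `freq` gets more than
    `cutoff` reads (assumed two-stage model, parameters are illustrative)."""
    lam = freq * n_cells  # expected number of target cells in the sample
    k_max = int(lam + 10 * math.sqrt(lam) + 20)  # truncate the Poisson sum
    total = 0.0
    for k in range(1, k_max + 1):
        # Assumed: expected reads scale with the sampled fraction k / n_cells.
        mean_reads = read_efficiency * n_reads * k / n_cells
        total += poisson_pmf(k, lam) * negbin_sf(cutoff, mean_reads, size)
    return total


# Example: frequency 1e-4, 100,000 sampled cells, 1 million reads, cutoff 18.
print(detection_probability(1e-4, 100_000, 1_000_000, 18))
```

Under these assumptions, sequencing deeper or lowering the cutoff can only raise the detection probability, while sampling too few cells caps it regardless of read depth.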

Keywords: T-cell receptor; TCRpower and adaptive immune receptor repertoire sequencing; bulk T-cell receptor sequencing; computational model; spike-in standards.

Figures

Figure 1
Study design. Our study presents a detection power calculator based on a computational model of TCR RNA read count in bulk sequencing data to enable efficient TCR sampling and RNA sequencing. Our model has two components: (1) modeling the number of target TCRs in a peripheral blood sample and (2) modeling the number of target TCR RNA sequencing reads based on the number of sampled target TCRs. To calibrate our model, we mixed RNA from T cells with known (spike-in) concentration, together with RNA from CD4 effector memory T cells with unknown TCRs, and sequenced TCRs from these mixtures using a 5′ RACE protocol. To investigate how library preparation choices affect detection power, we created three sequencing sets with different library preparation approaches. As controls, we performed TCR sequencing on the spike-in RNA mix only (Control spike-in TCC mix) and RNA of the effector memory CD4 T cells only (Control CD4 TEM). Created with Biorender.com.
Figure 2
Accuracy and variability in TCR frequency measurement. (A) Ground truth versus measured TCR frequency of the spike-in TCRs for experimental Sets 1–3 for all 6 replicates, showing consistent linear relationships. (B) TCR frequency dispersion index (standard deviation divided by the mean) across the 6 replicates of each TCR. The lower frequency TCRs have higher dispersion (R2) than the higher frequency TCRs. Some low-frequency TCRs are undetected (marked by X) either for a certain replicate (panel A) or across all six replicates (panel B).
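The dispersion index in panel B can be computed per TCR as sketched below. The caption does not say whether the sample or population standard deviation is used, so the sample version is an assumption here, and the function name `dispersion_index` is hypothetical.

```python
import statistics


def dispersion_index(replicate_freqs):
    """Standard deviation divided by mean across replicates of one TCR.

    Returns None when the TCR is undetected in every replicate (mean of 0),
    which corresponds to the X markers in panel B.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    mean = statistics.fmean(replicate_freqs)
    if mean == 0:
        return None
    return statistics.stdev(replicate_freqs) / mean


# Made-up frequencies for illustration only (six replicates each).
high_freq_tcr = [0.050, 0.048, 0.052, 0.051, 0.049, 0.050]
low_freq_tcr = [2e-6, 0.0, 1e-6, 0.0, 3e-6, 0.0]
print(dispersion_index(high_freq_tcr), dispersion_index(low_freq_tcr))
```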
Figure 3
Negative binomial modeling of the spike-in TCR read counts and detection limit estimation. (A) TRA (red) and TRB (blue) read count versus spike-in TCR frequency under three experimental conditions (Sets 1–3). The dots represent measured read counts, whereas the green line and areas are the respective mean and 95% prediction interval of negative binomial models with read efficiency parameter re and mean-variance relationship parameters η, λ, fitted by maximum likelihood. Tread = the total TRA or TRB read count for the set. (B) Estimated detection probability (read count > 0) as a function of TCR frequency, along with the minimal fraction (dashed line) that can be detected with at least 95% probability. Note that for TRA, Set 3 stands out with the lowest η value (i.e. variance) and 95% detection probability.
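The detection limit in panel B can be sketched in Python as follows. The caption names the parameters but not the model equations, so the forms used here are assumptions: the mean read count is taken as read efficiency × total reads × frequency, and the variance as mean + η·mean^λ, which is one common mean-variance form and may differ from the paper's. The function names are hypothetical.

```python
import math


def prob_detected(freq, total_reads, read_eff, eta, lam):
    """P(read count > 0) under an assumed negative binomial model."""
    mu = read_eff * total_reads * freq  # assumed mean read count
    if mu <= 0:
        return 0.0
    excess = eta * mu ** lam  # assumed variance in excess of the Poisson mean
    if excess <= 0:
        return 1.0 - math.exp(-mu)  # no overdispersion: Poisson limit
    size = mu * mu / excess  # negative binomial size parameter
    # For a negative binomial, P(X = 0) = (size / (size + mu)) ** size.
    p_zero = math.exp(-size * math.log1p(mu / size))
    return 1.0 - p_zero


def detection_limit(total_reads, read_eff, eta, lam, target=0.95):
    """Smallest frequency on a log-spaced grid (1e-8 to 1) that is detected
    with probability of at least `target`, or None if none qualifies."""
    for i in range(801):
        freq = 10 ** (-8 + i / 100)
        if prob_detected(freq, total_reads, read_eff, eta, lam) >= target:
            return freq
    return None


# Illustrative parameter values, not the fitted values from the paper.
print(detection_limit(total_reads=1_000_000, read_eff=1.0, eta=0.5, lam=2.0))
```

A grid scan is used instead of bisection because, under this assumed variance form, detection probability is not guaranteed to increase monotonically with frequency for every value of λ.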
Figure 4
Falsely detected sequences in the Control Spike-in TCC set. (A) Count distributions of TRA and TRB reads that did not match the TCRs of the spike-in TCCs but were nevertheless detected in the Control Spike-in TCC set. All of the falsely detected sequences were either present in other sets in the library (purple) or found exclusively in the Control Spike-in TCC set (green). The dotted lines represent the read cutoff (18). (B) Total read count over all CD4 TEM-containing sets versus read count in the Control Spike-in TCC Mix for the falsely detected sequences with read count >18 in the Control Spike-in TCC Mix. Note the linear trend characteristic of the index-hopping phenomenon.
Figure 5
Detection power estimation. Example output from our power calculator TCRpower, showing the probability of detecting a TCR with frequency 10⁻⁴ and read count threshold 18, as a function of the number of sampled TCRs and sequencing reads.
