J Immunol Methods. 2018 Dec:463:137-147.
doi: 10.1016/j.jim.2018.10.003. Epub 2018 Oct 10.

Sequencing the peripheral blood B and T cell repertoire - Quantifying robustness and limitations

Joel S Simon et al. J Immunol Methods. 2018 Dec.

Abstract

The adaptive immune response generates a large repertoire of T cells with T-cell receptors (TCRalpha and TCRbeta) and B cells with immunoglobulins (Ig). The repertoire changes in response to antigen stimulation through both amplification of specific cells (clonal expansion) and somatic hypermutation of immunoglobulins. Alterations of the immune repertoire have been observed in response to acute disease, such as infection by external pathogens, and chronic disease, such as autoimmunity and cancer. Here we establish experimental and analytical protocols for quantifying the immune repertoire in the peripheral blood of healthy human individuals by profiling the complementarity determining region 3 (CDR3) of the variable regions of TCRbeta (CDRβ3) and the IgG heavy chain (CDRH1, CDRH2, CDRH3). The results demonstrate that 40 ml of blood is sufficient to reliably capture the 10,000 most common TCRbeta and 1000 most common IgG clonotypes and determine their relative frequency in the circulation. We conclude that an accessible sample size of human PBMC allows robust monitoring of alterations in the immune repertoire.

Keywords: B cell receptor; Cancer; Immune repertoire; T cell receptor.

Figures

Figure 1. Experimental Setup:
An overview of sample processing. Donor-A gave ten 10ml draws of blood and Donor-B gave three. All blood draws were taken at the same time and each processed into a corresponding sample RNA pool. Aliquots of the RNA are used to create the amplified PCR product, aliquots of which are used to make multiple sequence libraries. Multiple libraries from the same sample RNA pool are compared to evaluate the coverage of the RNA-populations and multiple libraries from different sample RNA pools are compared to evaluate the Total-population.
Figure 2. Comparison of the abundance of clonotypes in samples.
The log of the abundance of the clonotypes from different libraries was plotted as a heat map. Top row: Comparison of two samples of IgG. Bottom row: Comparison of two samples of TCRbeta. Left column: Two libraries generated from the same RNA pool. Middle column: Two libraries generated from different RNA pools of donor A. Right column: Two libraries generated from different RNA pools of donor B.
Figure 3. Plotting pairwise library overlap:
The clonotype overlap and intersection of randomly subsampled library reads. Only clonotypes of abundance greater than five are visualized. Upper left, IgG from the same RNA sample, upper right, IgG from two different RNA samples. Lower left, TCRbeta from the same RNA sample, lower right, TCRbeta from two different RNA samples.
Figure 4. Number of reads and clonotypes in all libraries:
The number of clonotypes and the number of reads were quantified for, from left to right, all IgG, TCRbeta, only the CDR3 region of IgG, and the CDR3 region of IgG with a four-fold increase in the number of reads. Increasing the number of reads of the CDR3 region of IgG (IgG-SE-CDR3) had an insignificant effect on the total number of clonotypes.
Figure 5. Library Saturation Curves.
Groups of libraries of varying sizes are compared to the merger of all libraries. The percent of clonotypes from the total pool that exist within a merged sub-group is measured for each group size. For each group size all possible combinations are averaged. On the plot of TCRbeta saturation the top 100 (green) and top 1000 (orange) completely overlap.
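The saturation measurement described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each library is represented as a set of clonotype identifiers, and the function name `saturation_curve` and the toy data are hypothetical.

```python
from itertools import combinations


def saturation_curve(libraries):
    """Average percent of the total clonotype pool recovered per group size.

    libraries: list of sets of clonotype identifiers (an assumed
    representation; the source does not specify the data format).
    """
    total = set().union(*libraries)
    if not total:
        raise ValueError("no clonotypes in any library")
    curve = {}
    for size in range(1, len(libraries) + 1):
        percents = []
        # Every possible combination of `size` libraries is merged and scored.
        for group in combinations(libraries, size):
            merged = set().union(*group)
            percents.append(100.0 * len(merged & total) / len(total))
        curve[size] = sum(percents) / len(percents)
    return curve


# Toy data: three small libraries with partially shared clonotypes.
toy_libraries = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
curve = saturation_curve(toy_libraries)
```

Restricting `total` to the most abundant clonotypes before calling the function would give the analogue of the top 100 and top 1000 curves, assuming abundance ranks are available.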
Figure 6. Group Overlap Saturation Curves.
The overlap is measured at each range of UMI counts, with exponentially increasing ranges. The overlap at a range is the percent of clonotypes in that range that are present in the other library. If an overlap between ranges was taken exclusively it would miss some clonotypes with abundances differing by one. For varying subset sizes, all possible library subsets of that size are merged. Error bars indicate standard deviation.
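A rough sketch of the per-range overlap follows. It assumes each library is a mapping from clonotype to UMI count and that the ranges are the powers-of-two bins [1, 2), [2, 4), [4, 8), and so on. The caption says only that the ranges increase exponentially, so the bin edges, the function name, and the toy data are assumptions.

```python
def overlap_by_abundance_range(lib_a, lib_b, num_ranges=4):
    """Percent of lib_a clonotypes in each UMI range that occur in lib_b.

    lib_a, lib_b: dict mapping clonotype -> UMI count (assumed format).
    A clonotype counts as shared if it is present in lib_b at any
    abundance, so counts that differ slightly between libraries
    are not missed.
    """
    result = {}
    for exponent in range(num_ranges):
        low, high = 2 ** exponent, 2 ** (exponent + 1)
        in_range = [c for c, n in lib_a.items() if low <= n < high]
        if not in_range:
            result[(low, high)] = None  # no clonotypes fall in this range
            continue
        shared = sum(1 for c in in_range if c in lib_b)
        result[(low, high)] = 100.0 * shared / len(in_range)
    return result


# Toy data: clonotype -> UMI count.
library_a = {"x": 1, "y": 1, "z": 3, "w": 9}
library_b = {"x": 2, "z": 1}
overlap = overlap_by_abundance_range(library_a, library_b)
```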
Figure 7. Clonotype Abundance Distributions:
A histogram of the number of clonotypes that are found within each barcode abundance range. Displayed for IgG and TCRbeta clonotypes and averaged across all ten libraries. The IgG clonotype distribution is skewed towards those of mid-range abundance, with significantly fewer of high abundance. Error bars correspond to standard deviation.
Figure 8. Histograms of Library Occurrence.
A histogram showing the distribution of clonotypes of a single read, or 2, 4, 8, 16, 32 or 64 reads per clonotype as a function of the number of libraries in which they were found. For the TCRbeta, at 8 reads per clonotype and above, they were almost all found in all ten libraries.
Figure 9. Shared Barcode Sequence Similarity Histograms:
Identical barcodes from two different libraries were compared for the similarity of the coding sequence of their RNA. Similarity was quantified with a histogram of the normalized edit distance. If the normalized sequence distance is 0, then the sequences are identical. If identical barcodes from two different libraries also have identical coding sequences (peak at 0), those barcodes were characterized as contamination and discarded. As a control, when barcodes were matched at random (blue histogram), there were no values at 0. The other peaks correspond to sequences that share common regions of the V, D or J genes. The libraries that showed the highest degree of contamination (peak at 0) are shown for (Left) the pair of TCRbeta libraries (B3 & B10) and (Right) the pair of IgG libraries (B6 & B9). The fraction of each pairing of libraries that has values at zero is shown for the complete set of matches between libraries in Figure 10.
Figure 10. Pairwise contamination.
A heat map showing the percentages of sequences of IgG (left) or TCRbeta (right) believed to be contamination between different libraries. A read is considered contaminated between libraries if it has the same barcode and a normalized edit distance over the full nucleotide sequence of less than 0.05 (histogram values at zero, see Figure 9). The heat map shows the percentage of the reads in the library on the horizontal axis that were viewed as contamination from the library on the vertical axis. The percentage of total sequence varies since each library has a different number of reads. All reads deemed contamination by this criterion were removed from the library and not used in the analysis.
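The contamination criterion (same barcode, normalized edit distance below 0.05) can be illustrated with the sketch below. It is not the authors' implementation. It assumes Levenshtein distance divided by the length of the longer sequence, since the source does not state how the distance is normalized, and it assumes each library maps a barcode to a single sequence. All names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]
        for j, char_b in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                       # deletion
                            curr[j - 1] + 1,                   # insertion
                            prev[j - 1] + (char_a != char_b)))  # substitution
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def flag_contamination(lib_a, lib_b, threshold=0.05):
    """Barcodes present in both libraries whose sequences nearly match.

    lib_a, lib_b: dict mapping barcode -> nucleotide sequence (assumed format).
    """
    shared_barcodes = lib_a.keys() & lib_b.keys()
    return {bc for bc in shared_barcodes
            if normalized_distance(lib_a[bc], lib_b[bc]) < threshold}


# Toy data: one near-identical pair, one unrelated pair, one unshared barcode.
library_a = {"AAAA": "ACGTACGTACGTACGTACGTACGT", "CCCC": "ACGTACGT"}
library_b = {"AAAA": "ACGTACGTACGTACGTACGTACGA", "CCCC": "TTTTTTTT",
             "GGGG": "ACGT"}
flagged = flag_contamination(library_a, library_b)
```

In the toy data only barcode `AAAA` is flagged: its two sequences differ by one substitution in 24 bases, which is below the 0.05 threshold.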
Figure 11. A paired overlap comparison with and without UMI.
To further view the effect of UMI during the data processing stage of analysis, an IgG library processed with UMI was compared to itself without them. This was done by simply ignoring the UMI and jumping directly to the merge reads step in the data processing pipeline. Instead of filtering UMI by number of reads, each read was passed through the default quality check. A large divergence was observed: only 37.6% of the clonotypes were shared (Jaccard index).
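For reference, the Jaccard index quoted in this caption is the size of the intersection of two sets divided by the size of their union. A minimal sketch, with hypothetical clonotype labels standing in for the two processed versions of the library:

```python
def jaccard_index(first, second):
    """Size of the intersection divided by the size of the union."""
    first, second = set(first), set(second)
    union = first | second
    # Two empty sets are treated as having no overlap (a convention chosen here).
    return len(first & second) / len(union) if union else 0.0


# Hypothetical clonotype sets from the same library processed two ways.
with_umi = {"c1", "c2", "c3"}
without_umi = {"c2", "c3", "c4"}
shared_fraction = jaccard_index(with_umi, without_umi)
```

Here two of the four distinct clonotypes are shared, so the index is 0.5. The value of 37.6% reported in the caption corresponds to an index of 0.376.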
