Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2020 Dec 21:9:e61639.
doi: 10.7554/eLife.61639.

Early life imprints the hierarchy of T cell clone sizes

Affiliations

Early life imprints the hierarchy of T cell clone sizes

Mario U Gaimann et al. Elife. .

Abstract

The adaptive immune system responds to pathogens by selecting clones of cells with specific receptors. While clonal selection in response to particular antigens has been studied in detail, it is unknown how a lifetime of exposures to many antigens collectively shape the immune repertoire. Here, using mathematical modeling and statistical analyses of T cell receptor sequencing data, we develop a quantitative theory of human T cell dynamics compatible with the statistical laws of repertoire organization. We find that clonal expansions during a perinatal time window leave a long-lasting imprint on the human T cell repertoire, which is only slowly reshaped by fluctuating clonal selection during adult life. Our work provides a mechanism for how early clonal dynamics imprint the hierarchy of T cell clone sizes with implications for pathogen defense and autoimmunity.

Keywords: T cell immunity; fluctuating fitness; high-dimensional ecology; human; immune repertoire; immunology; imprinting; inflammation; physics of living systems; power-law scaling; repertoire sequencing.

Plain language summary

The human immune system develops a memory of pathogens that it encounters over its lifetime, allowing it to respond quickly to future infections. It does this partly through T cells, white blood cells that can recognize different pathogens. During an infection, the T cells that recognize the specific pathogen attacking the body will divide until a large number of clones of these T cells is available to help in the fight. After the infection clears, the immune system ‘keeps’ some of these cells so it can recognize the pathogen in the future, and respond quicker to an infection. Over the course of their lives, people will be infected by many different pathogens, leading to a wide variety of T cells that each respond to one of these pathogens. However, it is not well understood how various infections throughout the human lifespan shape the overall population of different T cells. Gaimann et al. used mathematical modelling to study how the composition of the immune system changes in people of different ages. Different populations of T cells – each specialized against a specific antigen – had been previously identified through genetic sequencing. Gaimann et al. analyzed their dynamics to show that many of the largest populations originate around birth, during the formation of the immune system. These findings suggest a potential mechanism for how exposure to pathogens in infancy can influence the immune system much later in life. The results may also explain variations in how people respond to infections and in their risk of developing autoimmune conditions. This understanding could help develop new treatments or interventions to guide the immune system as it develops.

PubMed Disclaimer

Conflict of interest statement

MG, MN, JD, AM No competing interests declared

Figures

Figure 1.
Figure 1.. Statistics of human T cell repertoire organization.
(A) T cells with highly diverse receptors are created from progenitor cells through genetic recombination (left), which then undergo clonal selection (middle) together shaping the immune repertoire. The T cell receptor (TCR) locus acts as a natural barcode for clonal lineages, which can be read out by sequencing (right). (B, C) Clone size distributions in two large cohort studies of human blood samples using disparate sequencing protocols display a power-law relationship between the rank and size of the largest clones. Each line shows the size distribution of all T cell clones in an individual in an unsorted blood sample, that is independently of the phenotypes of the cells making up the different clones. Ages are color coded as indicated in the legend. The black line shows a power law with a slope of -1 for visual comparison. Normalized clone sizes were defined as the number of reads of a given receptor’s sequence divided by the total number of reads within a sample and a factor equal to the average fraction of T cells with memory phenotype at different ages to account for variations in sampling depth and in the subset composition of peripheral blood, respectively (Figure 1—figure supplement 3). Only a single individual is displayed per 2-year age bracket to improve visibility. (D) Power-law exponents as a function of the age (legend: linear regression slope and coefficient of determination). Data sources: Britanova et al., 2016, Emerson et al., 2017.
Figure 1—figure supplement 1.
Figure 1—figure supplement 1.. Cohort age distributions.
Figure 1—figure supplement 2.
Figure 1—figure supplement 2.. Clone size distributions in phenotypically sorted T cell subsets.
(A) The naive cell fraction as determined by flow cytometry and the fraction of singletons are closely correlated in the Britanova cohort. To diminish the influence of sampling depth variations, we computationally subsampled all repertoires to an equal sample size of 5 · 105 counts. (B,C) Analysis of unsorted (TCR sequencing from all peripheral blood mononuclear cells), memory (CD3+, CD45+), and naive (CD3+, CD45RA+) blood samples from the same individual (Data source: Chu et al., 2019). (B) Clone size distributions in the different T cell compartments. Filtering naive clones that are also found in the memory compartment removes most large naive clones. (C) Frequency of large clones in the memory sample is shifted upwards relative to their frequency within the unsorted sample. Color represents logarithm of local kernel density estimate in regions with overplotting. The solid lines are guides to the eye (black line represents equal frequency, green line 2.6-fold higher frequency in the memory compartment). (D) The fraction of naive cells decreases with age (Data source: Britanova et al., 2016) starting in early infancy (Data source: Pediatric AIDS Clinical Trials Group et al., 2003). The legend shows the fitted time constant of exponential decay (± SE).
Figure 1—figure supplement 3.
Figure 1—figure supplement 3.. Influence of normalization procedure on clone size distributions.
(A,D) Raw clone size distributions show large variability due to different sample sizes. (B,E) A normalization by sampling depth removes much of this variation. (C,F) A normalization by the fraction of memory cells at different ages further collapses the tails of the clone size distributions.
Figure 1—figure supplement 4.
Figure 1—figure supplement 4.. Dependence of power-law exponent on age by cytomegalovirus (CMV) infection status and sex.
(A) Chronic infection with CMV drives large clonal expansions (Lindau et al., 2019; Sylwester et al., 2005). We thus repeated the analysis of Figure 1C separating individuals based on their CMV infection status (fitted lines shown in legend, regression results displayed as offset + slope · (age in years - 40)/10). Overall, CMV-positive individuals have a smaller α than uninfected individuals, which is independent of age. The average exponent in CMV-negative individuals decreases slowly with age, and in old age coincides those of CMV-positive individuals. Combining CMV infection status and age explained a significantly larger proportion of the variance in scaling exponents (17%) than age alone. (B) Many immune determinants differ markedly between the sexes (Klein and Flanagan, 2016). We thus analyzed whether α depends on sex. We find that the dependence on age is similar among the sexes, but men have on average a slightly smaller exponent than women indicating a more skewed repertoire organization. Data source: Emerson et al., 2017.
Figure 1—figure supplement 5.
Figure 1—figure supplement 5.. Clone size distributions in cordblood.
Each line shows the distribution in one individual. The black line shows a power law with a slope of -1 for visual comparison. The fitted power-law exponents α = 2.1 ± 0.1 (mean ± SE) are larger than in adult repertoires, but clone sizes are already remarkably broad. Data source: Britanova et al., 2016.
Figure 2.
Figure 2.. Emergence of power-law scaling of clone sizes in a minimal model of repertoire formation.
(A) Sketch of the stochastic dynamics of recruitment, proliferation, and death of T cells. Proliferation is inversely proportional to total repertoire size, which reflects increased competition as the repertoire grows. (B) Clone size distributions in simulated repertoires display power-law scaling (blue lines), in contrast to steady-state predictions that conform with those of a null model based only on demographic stochasticity (black line, Equation 48). (C) Illustration of the mechanism: early in life rates of proliferation exceed clonal turnover (lower panel). As the total repertoire size increases (gray line, upper panel) the proliferation rate decreases due to increased competition. The dynamics of selected clones after their recruitment marked by a dot is indicated by colored lines (upper panel). The line position shows the cumulative size of all prior clones, while the line width indicates the size of the clone (not to scale). The earlier a clones is recruited the larger it expands during the period of overall repertoire growth. (D) Dependence of the clone size distribution on parameters. Simulated repertoires at 5 years of age were subsampled to 106 cells to mimick the experimental sampling depth (solid lines). The simulated data closely follow predictions from a continuum theory of repertoire formation (dashed lines). Model parameters: (B,D) clonal death rate d = 0.2/year, clonal recruitment rate θ = 106/year, clone size at recruitment C0 = 1; (B) total proliferation rate b0 = 107/year (implying a recruitment to proliferation ratio γ = 0.1), (D) variable b0 as indicated in the legend by the ratio γ.
Figure 2—figure supplement 1.
Figure 2—figure supplement 1.. Analytical predictions for clone size distributions in a model with variable recruitment sizes.
Clone size distributions resulting from a variable recruitment size C(lognormal with standard deviation as indicated in legend) and repertoire growth (Equation 52). The black line shows a power law with a slope of -2.2 for visual comparison. Parameter: γ = 0.2.
Figure 2—figure supplement 2.
Figure 2—figure supplement 2.. Simulated clone size distributions in a model with saturation of proliferation rates.
Influence of a saturation of the proliferation rate, b=b0/(K+N), on the clone size distribution. The saturation induces a change of the scaling behavior at the largest clone sizes. Parameter: b0 = 2 · 104/year, d = 0.2/year, θ = 2 · 103/year (implying γ = 0.1), simulation length 5 years.
Figure 2—figure supplement 3.
Figure 2—figure supplement 3.. Simulated clone size distributions in a model with competition for clone-specific resources.
Clone size distributions in a simulated model where clones compete for specific antigens to which they bind with a probability pb. Model parameter: b0 = 104/year, θ = 103/year (implying γ = 0.1), Na = 1000, d = 0.2/year, simulation length 10 years.
Figure 3.
Figure 3.. Statistical dating of clones reveals that early expansions have a long-lasting effect.
(A) Genetic recombination of a TCR involves the choice of a V, D, and J region among multiple genomically-encoded templates as well as the deletion and insertion of nucleotides at both the VD and DJ junctions. The enzyme TdT, which is responsible for nucleotide insertions, is not expressed during early fetal development. This allows a statistical dating of clonal ages, as clones with zero insertions at both junctions constitute a much larger fraction of all clones during a fetal and perinatal time window. (B) Fraction (± SE) of clones with zero insertions as a function of age and clone size. Clones are binned by their size into non-overlapping bins (rank 1–500, 501–1000, and so on; upper values are indicated on the x-axis). (C) Same data as in B displayed with a rescaled x-axis using fitted parameters τd = 9.1 ± 0.5 years, r* = 1.2 ± 0.2 ⋅ 104. The data collapses onto a sigmoidal function predicted by theory (Equation 3) with fitted p0,- = 0.074 ± 0.004, p0,+ = 0.0187 ± 0.0005 (black line). Data source: Emerson et al., 2017.
Figure 3—figure supplement 1.
Figure 3—figure supplement 1.. Fraction of zero insertion clones within productive and unproductive sequences.
Sequences with zero insertions code for a particular subset of all possible T cell receptors (TCRs), and some of their enrichment might represent a peripheral selective advantage of this subset of receptors. We thus asked how the enrichment depends on whether the sequence used to define the clone represents a productive or unproductive rearrangement. An unproductive rearrangement, in which the recombination process introduces a frameshift or stop codon, can be rescued by a second productive rearrangement, but is not expressed and thus not selected upon. Under the adult recombination statistics an unproductive zero insertion sequence is likely to be paired with a productive sequence with many insertions, and thus we would not expect to see a similar enrichment for unproductive sequences if a general peripheral selective advantage was causing the enrichment. Data source: Emerson et al., 2017.
Figure 3—figure supplement 2.
Figure 3—figure supplement 2.. Enrichment of clones with known specificity.
(A) Fraction of clones with TCRs that have exact matches in the VDJdb (Shugay et al., 2018) of known antigen specificities. (B) Fraction of clones with close matches (defined as nearest neighbor sequences in a Levenshtein distance sense, that is sequences with a single amino acid substitution, insertion or deletion). T cells known to be specific to particular antigens are enriched among the most abundant clones. However, there is little change in this enrichment as a function of age. Data source: Emerson et al., 2017.
Figure 3—figure supplement 3.
Figure 3—figure supplement 3.. Enrichment of clones with high probability of generation.
Fraction of clones with T cell receptor (TCR) sequences σ with a probability of generation Pgen(σ) higher than 10-9. The probability of generation was calculated based on the nucleotide sequence using a probabilistic model of recombination with default parameters for human TCR sequences (Sethna et al., 2019). To remove confounding by the early expansionary dynamics, we excluded zero insertion clones as most of these clones also have high probability of generation. We find that clones with high Pgen are moderately more likely to be large. In comparison to the zero insertion clones, there is little change in their enrichment as a function of age. Data source: Emerson et al., 2017.
Figure 3—figure supplement 4.
Figure 3—figure supplement 4.. Fraction of zero insertion clones as function of age and clone size by cytomegalovirus (CMV) infection status.
In CMV positive individuals zero insertion clones are less enriched among the largest clones. Data source: Emerson et al., 2017.
Figure 4.
Figure 4.. The small magnitude of longitudinal clone size fluctuations implies a slow reordering of the clone size hierarchy.
(A,B) Longitudinal clonal dynamics in a healthy adult over a one year time span. (A) Fraction of the 1000 largest clones that fall within a specific clone size rank bin at the earliest time point. A small number of clones was not detected at all at the first time point (ND) likely representing recently expanded clones. All other clones were already among the largest clones initially. (B) Variance of log-foldchanges in clone size as a function of time difference for the 250 largest clones. (C) Fraction of clones with zero insertions as a function of age and clone size in a simulated cohort using a magnitude of clonal growth rate fluctuations inferred from the longitudinal data. Data source: Chu et al., 2019.
Figure 4—figure supplement 1.
Figure 4—figure supplement 1.. Provenance of large T cell clones for all subjects.
Longitudinal analysis of the origin of the 1000 largest clones at each time point (indicated by arrows) in three healthy adults over a 1-year time frame. For each clone, we determined whether it was also sampled at the earliest time point, and if so at what clone size. The plot displays the fraction of clones that fall within a specific clone size rank bin at the first time point. At all times, a majority of clones was already large initially. A small fraction was not detected at all at the first time point (ND) likely representing recently expanded clones. (Supplement to Figure 4A which corresponds to panel C.) Data source: Chu et al., 2019.
Figure 4—figure supplement 2.
Figure 4—figure supplement 2.. Dynamics of large persistent clones for all subjects.
Dynamics of the 250 largest clones from second time point onwards excluding those not sampled at the first time point. (A-C) Fraction of the repertoire represented by these clones (sum of their normalized clone sizes); (D-F) mean and (G-I) variance of the log-foldchanges of their normalized clone sizes relative to time point 2. (Supplement to Figure 4B which corresponds to panel I.) Data source: Chu et al., 2019.
Figure 4—figure supplement 3.
Figure 4—figure supplement 3.. Data collapse by parameter rescaling for the simulated cohort.
Same data as in Figure 3F displayed with a rescaled x-axis using fitted parameters τd=10.2±0.4years,r=1.19±0.08104. The data collapses onto a sigmoidal function predicted by theory (Equation 3) with fitted p0,-=0.0695±0.0012, p0,+=0.0198±0.0003 (black line).
Figure 5.
Figure 5.. Validation of the mean-field approximation.
Comparison of full stochastic simulations and simulations using mean-field competition. Parameter: b0=2104/year, d=0.2/year, θ=2103/year (implying γ = 0.1), simulation length 5 years.
Figure 6.
Figure 6.. Fluctuating fitness model out-of-steady state.
Analytical predictions for the clone size distributions in a geometric Brownian motion fluctuating fitness model (Integral of Equation 26) as a function of effective age τ=Tσ2. The black line shows the asymptotic prediction for the steady-state scaling. Parameter: α=1.2.
Appendix 1—figure 1.
Appendix 1—figure 1.. Estimated power-law exponents converge to correct value using trimming method.
Fitted exponent as a function of the cutoff choice in simulated data (errorbars ±2 SE over 50 independent draws). The fitted exponent changes drastically for small Cmin before leveling off indicating deviations from true power-law scaling at the smallest clone sizes. Such a deviation is expected due to subsampling despite the true power-law scaling in the underlying distribution (see text). Simulations: 107 clones were drawn from a discrete power-law distribution with α=2.15. A sample of size 5105 cells was then drawn from the underlying power law based on a Poisson (blue dots) or negative binomial sampling (orange and green dots show two choices of the overdispersion coefficient a).
Appendix 1—figure 2.
Appendix 1—figure 2.. Influence of choice of Cmin on fitted power-law exponent for empirical data.
Fitted exponent as a function of the cutoff choice (black lines: 50 random repertoires, blue line: mean) in the (A) Britanova et al. and (B) Emerson et al. datasets. The fitted exponent changes drastically for small Cmin before leveling off indicating deviations from true power-law scaling at the smallest clone sizes, similarly to those seen in simulated data (Figure 1). To alleviate the bias induced by finite sampling, we choose a cutoff value Cmin, for which the power-law exponent estimates have leveled off. For large Cmin the variance of fitted exponent increases as more and more data is excluded from the fit (A, B inset), which sets a practical upper bound for choosing Cmin.
Appendix 1—figure 3.
Appendix 1—figure 3.. Graphical display of subsampled power-law distributions.
(A-D) Show various ways of displaying clone size distributions obtained by subsampling an underlying clone size distribution consisting of 108 clones drawn according to P(C)C-2.2 to various sampling depths. (A) The empirical probability density function of clone sizes, (B) its cumulative density, as well as (C) the cumulative density of normalized clone sizes are not invariant under changes of the sampling depth. Only the tail behavior of relative frequencies of finding cells from large clones is reproducibly captured, which makes rank-frequency plots (displays of unnormalized cumulative distributions of normalized clone sizes) the method of choice for collapsing clone size distributions at various sampling depths.

References

    1. Ahmed R, Gray D. Immunological memory and protective immunity: understanding their relation. Science. 1996;272:54–60. doi: 10.1126/science.272.5258.54. - DOI - PubMed
    1. Akondy RS, Fitch M, Edupuganti S, Yang S, Kissick HT, Li KW, Youngblood BA, Abdelsamed HA, McGuire DJ, Cohen KW, Alexe G, Nagar S, McCausland MM, Gupta S, Tata P, Haining WN, McElrath MJ, Zhang D, Hu B, Greenleaf WJ, Goronzy JJ, Mulligan MJ, Hellerstein M, Ahmed R. Origin and differentiation of human memory CD8 T cells after vaccination. Nature. 2017;552:362–367. doi: 10.1038/nature24633. - DOI - PMC - PubMed
    1. Altan-Bonnet G, Mora T, Walczak AM. Quantitative immunology for physicists. Physics Reports. 2020;849:1–83. doi: 10.1016/j.physrep.2020.01.001. - DOI
    1. Antia R, Ganusov VV, Ahmed R. The role of models in understanding CD8+ T-cell memory. Nature Reviews Immunology. 2005;5:101–111. doi: 10.1038/nri1550. - DOI - PubMed
    1. Arstila TP, Casrouge A, Baron V, Even J, Kanellopoulos J, Kourilsky P. A direct estimate of the human alphabeta T cell receptor diversity. Science. 1999;286:958–961. doi: 10.1126/science.286.5441.958. - DOI - PubMed

Publication types