Review
Front Immunol. 2018 Feb 21:9:224.
doi: 10.3389/fimmu.2018.00224. eCollection 2018.

Computational Strategies for Dissecting the High-Dimensional Complexity of Adaptive Immune Repertoires

Enkelejda Miho et al. Front Immunol.

Abstract

The adaptive immune system recognizes antigens via an immense array of antigen-binding antibodies and T-cell receptors, the immune repertoire. The interrogation of immune repertoires is of high relevance for understanding the adaptive immune response in disease and infection (e.g., autoimmunity, cancer, HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. Several methods for the computational and statistical analysis of large-scale AIRR-seq data have been developed to resolve immune repertoire complexity and to understand the dynamics of adaptive immunity. Here, we review the current research on (i) diversity, (ii) clustering and network, (iii) phylogenetic, and (iv) machine learning methods applied to dissect, quantify, and compare the architecture, evolution, and specificity of immune repertoires. We summarize outstanding questions in computational immunology and propose future directions for systems immunology toward coupling AIRR-seq with the computational discovery of immunotherapeutics, vaccines, and immunodiagnostics.

Keywords: B-cell receptor; T-cell receptor; antibody discovery; artificial intelligence; immunogenomics; networks; phylogenetics; systems immunology.

Figures

Figure 1
The immune repertoire space is defined by diversity, architecture, evolution, and convergence. (A) Diversity measurements are based on (i) the accurate annotation of V(D)J segments using deterministic and probabilistic approaches with population-level or individualized germline gene reference databases. (ii) Probabilistic and hidden Markov models allow inference of recombination statistics. (iii) Measurement of clonotype diversity using diversity profiles. (B) Analysis of repertoire architecture relies predominantly on (i) clonal networks that are constructed by connecting nucleotide or amino acid sequence nodes by similarity edges. The sequence similarity between clones is defined via a string distance [e.g., Levenshtein distance (LD)], resulting in undirected Boolean networks for a given threshold (nucleotides/amino acids). An example of a global network parameter is the diameter, shown by black edges. An example of a local network parameter is the degree (n = 1) of the individual clonal node shown in black. (ii) Degree distribution is a global characteristic of immune repertoire networks, which can be used for analyzing clonal expansion. (iii) The immune repertoire can be decomposed into several similarity layers. Layer D1 captures clonal nodes that differ by edit distance 1 (1 nt/a.a. different), layer D2 those that differ by edit distance 2, and so forth. (C) Assessing evolution of antibody lineages. (i) Reconstruction of phylogenetic trees. Stars indicate somatic hypermutation. (ii) Probabilistic methods for the inference of mutation statistics in antibody lineage evolution. (iii) Simulation of antibody repertoire evolution for benchmarking antibody-tailored phylogenetic inference algorithms. (D) Naive and antigen-driven cross-individual sequence similarity and convergence in immune repertoires. (i) The Venn diagram shows sequences shared in the two repertoires (circles). Signature-like sequence features are highlighted by black squares. (ii) Database of convergent or antigen-specific immune receptor sequences. (iii) K-mer sequence decomposition and classification of immune receptor sequences.
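The network construction described in panel (B) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the clone strings are made-up CDR3-like sequences, the function names `levenshtein`, `similarity_layer`, and `degrees` are hypothetical, and each layer here keeps only pairs at exactly the stated edit distance (a threshold network would instead keep all pairs at or below it). The all-pairs comparison is quadratic in the number of clones, so it is suitable only for small examples.

```python
from collections import Counter
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def similarity_layer(clones, distance):
    """Undirected edges between unique clones at exactly the given edit distance."""
    unique = sorted(set(clones))
    return {(u, v) for u, v in combinations(unique, 2)
            if levenshtein(u, v) == distance}


def degrees(clones, edges):
    """Number of similarity edges attached to each clonal node."""
    degree = {clone: 0 for clone in set(clones)}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


# Hypothetical amino acid sequences, invented for illustration only.
clones = ["CARDYW", "CARDFW", "CARGFW", "CTRDYW", "CASSLGW"]

layer_d1 = similarity_layer(clones, 1)   # layer D1: 1 a.a. different
layer_d2 = similarity_layer(clones, 2)   # layer D2: 2 a.a. different
degree_d1 = degrees(clones, layer_d1)    # local parameter per node
degree_distribution = Counter(degree_d1.values())  # global characteristic
```

In this toy repertoire, layer D1 contains three edges and the clone `CASSLGW` remains unconnected, so the degree distribution already separates connected clones from an isolated one.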
Figure 2
An overview of selected computational tools used in immune repertoire analyses. Each horizontal colored bar in the Basis column represents a unique antibody or T-cell receptor (TCR) sequence. Vertical red bars represent sequence differences or somatic hypermutation. The Method column describes the general concept of the computational methods and how these are applied to immune repertoires. The Tools column highlights exemplary key resources for performing computational analysis in the respective analytical sections [rows (A–D)].
