Review
Genome Med. 2015 Nov 20:7:121.
doi: 10.1186/s13073-015-0243-2.

Practical guidelines for B-cell receptor repertoire sequencing analysis

Gur Yaari et al. Genome Med.

Abstract

High-throughput sequencing of B-cell immunoglobulin repertoires is increasingly being applied to gain insights into the adaptive immune response in healthy individuals and in those with a wide range of diseases. Recent applications include the study of autoimmunity, infection, allergy, cancer and aging. As sequencing technologies continue to improve, these repertoire sequencing experiments are producing ever larger datasets, with tens- to hundreds-of-millions of sequences. These data require specialized bioinformatics pipelines to be analyzed effectively. Numerous methods and tools have been developed to handle different steps of the analysis, and integrated software suites have recently been made available. However, the field has yet to converge on a standard pipeline for data processing and analysis. Common file formats for data sharing are also lacking. Here we provide a set of practical guidelines for B-cell receptor repertoire sequencing analysis, starting from raw sequencing reads and proceeding through pre-processing, determination of population structure, and analysis of repertoire properties. These include methods for unique molecular identifiers and sequencing error correction, V(D)J assignment and detection of novel alleles, clonal assignment, lineage tree construction, somatic hypermutation modeling, selection analysis, and analysis of stereotyped or convergent responses. The guidelines presented here highlight the major steps involved in the analysis of B-cell repertoire sequencing data, along with recommendations on how to avoid common pitfalls.
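
As an illustration of the unique molecular identifier (UMI) error-correction step mentioned above, the following is a minimal sketch, not the method of any specific tool. It assumes that reads sharing a UMI derive from the same original molecule, that they are already aligned and of equal length, and that a per-position majority vote is an acceptable consensus rule. The function name `umi_consensus` and the input layout are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Collapse reads that share a UMI into one consensus sequence.

    reads: iterable of (umi, sequence) pairs. Sequences within a UMI
    group are assumed to be pre-aligned and of equal length (an
    assumption of this sketch, not a property of real data).
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Majority vote at each position across the reads in this group.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


example_reads = [
    ("AAC", "ACGT"),
    ("AAC", "ACGT"),
    ("AAC", "ACTT"),  # one read carries a likely sequencing error
    ("GGT", "TTTT"),
]
result = umi_consensus(example_reads)
```

In practice, reads within a UMI group may differ in length or contain indels, so a real pipeline would need an alignment step and quality-aware voting before building the consensus.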

Figures

Fig. 1
An overview of repertoire sequencing data production. The B-cell immunoglobulin receptor (BCR) is composed of two identical heavy chains (generated by recombination of V, D and J segments), and two identical light chains (generated by recombination of V and J segments). The large number of possible V(D)J segments, combined with additional (junctional) diversity introduced by stochastic nucleotide additions/deletions at the segment junctions (particularly in the heavy chain), leads to a theoretical diversity of >10^14. Further diversity is introduced into the BCR during adaptive immune responses, when activated B cells undergo a process of somatic hypermutation (SHM). SHM introduces point mutations into the DNA coding for the BCR at a rate of ~10^-3 per base pair per division [119, 120]. B cells accumulating mutations that improve their ability to bind pathogens are preferentially expanded in a process known as affinity maturation. The biology underlying these processes has been reviewed previously [121]. BCR repertoire sequencing (Rep-seq) experiments can be carried out on mRNA (shown here) or genomic DNA. Sequencer image: A MiSeq from Illumina/Konrad Förstner/Wikimedia Commons/Public Domain. 5′ RACE 5′ rapid amplification of cDNA ends, UMI unique molecular identifier, 5′ UTR 5′ untranslated region
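
To give a sense of scale for the SHM rate quoted in the caption above, the following back-of-the-envelope sketch computes the expected number of point mutations as rate × length × divisions. It assumes a constant rate and independent mutations at every position, which is a simplification. The 350 bp sequence length and the 10 divisions are hypothetical values chosen for illustration, not figures from the article.

```python
def expected_shm_mutations(length_bp, divisions, rate=1e-3):
    """Expected number of point mutations under a constant-rate model.

    rate is the per-base-pair, per-division mutation rate (~10^-3 in the
    caption). Treating every position as equally mutable is an assumption
    of this sketch.
    """
    return rate * length_bp * divisions


# Hypothetical inputs: a 350 bp variable region followed over 10 divisions.
expected = expected_shm_mutations(length_bp=350, divisions=10)
```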
Fig. 2
The essential steps in repertoire sequencing analysis. Repertoire sequencing (Rep-seq) analysis can be divided into three stages: pre-processing; inference of B-cell population structure; and detailed repertoire analysis. Pre-processing transforms the next-generation sequencing reads into error-corrected B-cell immunoglobulin receptor (BCR) sequences, which are then aligned to identify the V(D)J germline genes. Next, the dynamic population structure of the BCR repertoire is inferred. Finally, quantitative features of the B-cell repertoire are calculated. MID multiplex identifier, SHM somatic hypermutation
Fig. 3
Example outcomes of repertoire sequencing analysis. (a) A violin plot comparing the distribution of somatic mutation frequencies (across B-cell immunoglobulin receptor (BCR) sequences) between two repertoires. (b) The observed mutation frequency at each position in the BCR sequence, with the complementarity determining regions (CDRs) indicated by shaded areas. (c) Comparing the diversity of two repertoires by plotting Hill curves using Change-O [31]. (d) A “hedgehog” plot of estimated mutabilities for DNA motifs centered on the base cytosine (C), with coloring used to indicate traditional hot- and coldspots. (e) A lineage tree with superimposed selection strength estimates calculated using BASELINe [110]. (f) Pie chart depicting V segment usage for a single repertoire. (g) Comparison of selection strengths in two repertoires by plotting the full probability density function for the estimate of selection strength (calculated using BASELINe) for the CDR (top) and framework region (FWR; bottom). (h) Stream plot showing how clones expand and contract over time. (i) V segment genotype table for seven individuals determined using TIgGER [57]
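
The Hill curves in the diversity panel plot a diversity index against an order parameter q. The sketch below computes the standard Hill number, (sum of p_i^q)^(1/(1-q)), with the q = 1 case taken as the exponential of Shannon entropy. It is a plain illustration of the formula and does not reproduce the Change-O implementation, which may add resampling or confidence intervals. The function name `hill_diversity` and the clone-size inputs are hypothetical.

```python
import math


def hill_diversity(clone_sizes, q):
    """Hill number of order q for a list of clone sizes (counts).

    q = 0 gives richness (number of clones), q = 1 gives the exponential
    of Shannon entropy, and q = 2 gives the inverse Simpson index.
    """
    total = sum(clone_sizes)
    props = [n / total for n in clone_sizes if n > 0]
    if math.isclose(q, 1.0):
        # Limit of the general formula as q approaches 1.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


# Hypothetical repertoires: one even, one dominated by a single clone.
even = [10, 10, 10, 10]
skewed = [90, 10]
curve_even = [hill_diversity(even, q) for q in (0, 1, 2)]
curve_skewed = [hill_diversity(skewed, q) for q in (0, 1, 2)]
```

For a perfectly even repertoire the curve is flat at the number of clones; the more a few clones dominate, the faster the curve drops as q increases.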

References

    1. Boyd SD, Joshi SA. High-throughput DNA sequencing analysis of antibody repertoires. Microbiol Spectr. 2014;2. doi: 10.1128/microbiolspec.AID-0017-20.
    2. Robins H. Immunosequencing: applications of immune repertoire deep sequencing. Curr Opin Immunol. 2013;25(5):646–52. doi: 10.1016/j.coi.2013.09.017.
    3. Arnaout R, Lee W, Cahill P, Honan T, Sparrow T, Weiand M, et al. High-resolution description of antibody heavy-chain repertoires in humans. PLoS One. 2011;6(8):22365. doi: 10.1371/journal.pone.0022365.
    4. Galson JD, Trück J, Fowler A, Münz M, Cerundolo V, Pollard AJ, et al. In-depth assessment of within-individual and inter-individual variation in the B cell receptor repertoire. Front Immunol. 2015;6:1–13. doi: 10.3389/fimmu.2015.00531.
    5. Boyd SD, Gaeta BA, Jackson KJ, Fire AZ, Marshall EL, Merker JD, et al. Individual variation in the germline Ig gene repertoire inferred from variable region gene rearrangements. J Immunol. 2010;184(12):6986–92. doi: 10.4049/jimmunol.1000445.