2018 Jul 23:9:1698.
doi: 10.3389/fimmu.2018.01698. eCollection 2018.

Structurally Mapping Antibody Repertoires

Konrad Krawczyk et al. Front Immunol.

Abstract

Every human possesses millions of distinct antibodies. It is now possible to analyze this diversity via next-generation sequencing of immunoglobulin genes (Ig-seq). This technique produces large-volume sequence snapshots of B-cell receptors that are indicative of the antibody repertoire. In this paper, we enrich these large-scale sequence datasets with structural information. Enriching a sequence with its structural data allows better approximation of many vital features, such as its binding site and specificity. Here, we describe the structural annotation of antibodies (SAAB) pipeline, which maps the outputs of large Ig-seq experiments to known antibody structures. We demonstrate the viability of our protocol on five separate Ig-seq datasets covering ca. 35 million unique amino acid sequences from ca. 600 individuals. Despite the great theoretical diversity of antibodies, we find that the majority of sequences coming from such studies can be reliably mapped to an existing structure.

Keywords: B-cell receptor; antibody specificity; bioinformatics tools; next-generation sequencing; protein; structural homology.

Figures

Figure 1
The structural annotation of antibodies algorithm. The input consists of amino acid sequences in FASTA format. These sequences are Chothia-numbered using ANARCI (29). Chothia-numbered sequences are then aligned to known structures of antibodies as defined by the structural antibody database (25). Best templates are identified for the entire variable region as well as for the Chothia-delimited framework only. The full variable region templates are used to define complementarity determining region (CDR) anchoring residues, which serve as input to FREAD. FREAD then determines whether a suitable template can be identified for each of the CDRs.
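The template-selection step in the caption above can be illustrated with a minimal sketch. It is not the SAAB implementation: it assumes sequences have already been Chothia-numbered (for example by ANARCI) and are held as dictionaries from position label to residue, it assumes sequence identity is the fraction of query positions with a matching residue in the template, and the names `sequence_identity`, `restrict`, `best_template`, `tmplA`, and `tmplB` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Assumed representation: Chothia position label (e.g. "H35") -> residue letter.
Numbered = Dict[str, str]


def sequence_identity(query: Numbered, template: Numbered) -> float:
    """Fraction of query positions whose residue matches the template
    at the same numbered position (an assumed definition of identity)."""
    if not query:
        return 0.0
    matches = sum(1 for pos, aa in query.items() if template.get(pos) == aa)
    return matches / len(query)


def restrict(seq: Numbered, positions: Optional[Set[str]]) -> Numbered:
    """Keep only the given positions, e.g. framework positions; None keeps all."""
    if positions is None:
        return dict(seq)
    return {pos: aa for pos, aa in seq.items() if pos in positions}


def best_template(
    query: Numbered,
    templates: Dict[str, Numbered],
    positions: Optional[Set[str]] = None,
) -> Tuple[Optional[str], float]:
    """Return the template name with the highest identity to the query,
    optionally comparing only a subset of positions."""
    q = restrict(query, positions)
    best_name: Optional[str] = None
    best_score = -1.0
    for name, tmpl in templates.items():
        score = sequence_identity(q, restrict(tmpl, positions))
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None:
        return None, 0.0
    return best_name, best_score


# Toy data for illustration only; not real antibody sequences or structures.
query = {"H1": "E", "H2": "V", "H3": "Q", "H4": "L"}
templates = {
    "tmplA": {"H1": "E", "H2": "V", "H3": "K", "H4": "L"},
    "tmplB": {"H1": "Q", "H2": "V", "H3": "Q", "H4": "V"},
}

full_match = best_template(query, templates)
subset_match = best_template(query, templates, positions={"H2", "H3"})
```

Running the same function once over all positions and once over a framework-only position set mirrors the two template searches described in the caption. The best template can differ between the two, as it does in the toy data.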
Figure 2
Chothia-aligning the 13.5 million unique baseline antibody variable sequences in datasets UCB_H and UCB_L to antibodies with known structures. (A) Full variable region sequence of the heavy chain. (B) Framework of the heavy chain. (C) Full variable region of the light chain. (D) Framework of the light chain. The pink bars indicate the number of sequences (right-hand y-axis) whose best structural match has the sequence identity given on the x-axis. The blue line (left-hand y-axis) indicates the expected root mean square deviation (RMSD) of a model built using a sequence identity match of that quality (with vertical SD error bars). For example, 80% sequence identity for the framework of the heavy chain translates to a 0.8 Å expected model RMSD.
Figure 3
Example of how structural mapping provides clues to antibody specificity. Structural annotation of antibodies (SAAB) outputs the Protein Data Bank (PDB) codes used to map frameworks, full variable sequence, and each of the complementarity determining regions for a sequence. The PDB codes are also mapped to the antigens recognized by the antibody structures (as stored in the structural antibody database). If sequences map to similar PDB structures, this could indicate similar binding sites and thus similar specificity. As an example, we examined the top 10 PDBs that were used to map H3 in the FLU dataset. More than 7,000 H3 sequences were mapped to 4m5z, a complex of an antibody with influenza hemagglutinin (this is not among the top 10 H3-mapped PDBs in our other datasets). We show several sequence-diverse H3 loops on the left, which are unlikely to be grouped together by sequence-only methods. However, SAAB identifies that they are all likely to share a similar structure to the H3 loop of 4m5z (right, in blue) and, therefore, perhaps similar specificity.
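The grouping described in this caption (counting which PDB templates are used most often for H3, then collecting the sequences that share a template) can be sketched as follows. This is an illustration under assumptions, not SAAB's output format: the input is assumed to be a dictionary from H3 loop sequence to the PDB code of its template, the function names `top_templates` and `group_by_template` are hypothetical, the loop sequences are invented toy strings, and `XXXX` is a placeholder rather than a real PDB code.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def top_templates(template_by_h3: Dict[str, str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequently used template codes with their counts."""
    return Counter(template_by_h3.values()).most_common(n)


def group_by_template(template_by_h3: Dict[str, str]) -> Dict[str, List[str]]:
    """Collect H3 sequences under the template code they were mapped to."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for h3_sequence, pdb_code in template_by_h3.items():
        groups[pdb_code].append(h3_sequence)
    return dict(groups)


# Toy mapping for illustration only: the sequences are invented and
# "XXXX" is a placeholder, not a real PDB entry.
toy_mapping = {
    "ARDYYGSSYFDY": "4m5z",
    "ARGGWLRPFDY": "4m5z",
    "AKDRSGYDAFDI": "4m5z",
    "ARHGNYVMDY": "XXXX",
}

ranking = top_templates(toy_mapping, n=1)
groups = group_by_template(toy_mapping)
```

Sequences that land in the same group may look unrelated at the sequence level, which is the point made in the caption: a shared template is a hint of similar loop structure, not proof of shared specificity.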

