Front Microbiol. 2012 Sep 11:3:315.
doi: 10.3389/fmicb.2012.00315. eCollection 2012.

Somatic Populations of PGT135-137 HIV-1-Neutralizing Antibodies Identified by 454 Pyrosequencing and Bioinformatics

Jiang Zhu et al. Front Microbiol. 2012.

Abstract

Select HIV-1-infected individuals develop sera capable of neutralizing diverse viral strains. The molecular basis of this neutralization is currently being deciphered by the isolation of HIV-1-neutralizing antibodies. In one infected donor, three neutralizing antibodies, PGT135-137, were identified by assessment of neutralization from individually sorted B cells and found to recognize an epitope containing an N-linked glycan at residue 332 on HIV-1 gp120. Here we use next-generation sequencing and bioinformatics methods to interrogate the B cell record of this donor to gain a more complete understanding of the humoral immune response. PGT135-137-gene family specific primers were used to amplify heavy-chain and light-chain variable-domain sequences. Pyrosequencing produced 141,298 heavy-chain sequences of IGHV4-39 origin and 87,229 light-chain sequences of IGKV3-15 origin. A number of heavy and light-chain sequences of ∼90% identity to PGT137, several to PGT136, and none of high identity to PGT135 were identified. After expansion of these sequences to include close phylogenetic relatives, a total of 202 heavy-chain sequences and 72 light-chain sequences were identified. These sequences were clustered into populations of 95% identity comprising 15 for heavy chain and 10 for light chain, and a select sequence from each population was synthesized and reconstituted with a PGT137-partner chain. Reconstituted antibodies showed varied neutralization phenotypes for HIV-1 clade A and D isolates. Sequence diversity of the antibody population represented by these tested sequences was notably higher than observed with a 454 pyrosequencing-control analysis on 10 antibodies of defined sequence, suggesting that this diversity results primarily from somatic maturation. 
Our results thus provide an example of how pathogens like HIV-1 are opposed by a varied humoral immune response, derived from intrinsic mechanisms of antibody development, and embodied by somatic populations of diverse antibodies.

Keywords: HIV-1; N-linked glycan; antibody bioinformatics; high-throughput sequencing; immunity.

Figures

Figure 1
Sequence variation as a consequence of 454 pyrosequencing for ten plasmid-control antibodies. To quantify sequencing error, ten antibodies, input as purified plasmid DNA, were subjected to 454 pyrosequencing. Tested plasmid antibodies included VRC01, VRC03, VRC-PG04, VRC-CH31, VRC-CH33, a codon-optimized version of the inferred, reverted unmutated ancestor of VRC-PG04 (termed VRC-PG04cog), gVRC-H3d74, gVRC-H6d74, gVRC-H12d74, and gVRC-H15d74. Heavy-chain sequences are plotted as a function of sequence identity to the plasmid antibody (vertical axes) and of sequence divergence from their germline gene allele, IGHV1-2*02 (horizontal axes). The sequencing data used for divergence/identity analysis were processed by the standard bioinformatics pipeline without the error-correction step. Color coding indicates the number of sequences. For VRC01 and VRC03, additional contour plots displaying the estimated mutational error range (one root-mean-square deviation, 1.38% for the VRC01 group and 1.26% for the VRC03 group) have been shaded red around the input antibody.
Figure 2
Repertoire of donor 39 heavy-chain variable-domain sequences of IGHV4-39 origin determined by 454 pyrosequencing. After processing by a standard bioinformatics pipeline (see Materials and Methods), 141,298 full-length, heavy-chain variable-domain sequences from the IGHV4-39 germline family were obtained. These are plotted as a function of sequence identity to the heavy-chain variable domain of PGT135 (A), PGT136 (B), and PGT137 (C) and of sequence divergence from the inferred IGHV4-39 germline allele. Color coding indicates the number of sequences. The 10%-identity gap indicates that the sequences within the upper island in 2C are somatic variants of PGT137 and not caused by sequencing errors.
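The two plot coordinates used in these divergence/identity figures can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: it assumes the read, the germline allele, and the reference antibody have already been placed in one shared alignment of equal length with "-" marking gaps, and the function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same symbol.

    Assumption: a and b are pre-aligned to equal length, with '-' for gaps.
    Columns that are gaps in both sequences are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def divergence_identity(read: str, germline: str, reference: str) -> tuple:
    """Return (divergence from germline, identity to reference), both in percent.

    Divergence is taken here as 100 minus identity to the germline allele,
    which is an assumption of this sketch.
    """
    divergence = 100.0 - percent_identity(read, germline)
    identity = percent_identity(read, reference)
    return divergence, identity
```

Each read then contributes one point, with divergence on the horizontal axis and identity on the vertical axis; the color coding in the figures corresponds to the number of reads falling at each position.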
Figure 3
Evolutionary similarity of PGT135–137 to donor 39 heavy-chain variable-domain sequences. Germline-rooted maximum-likelihood tree of PGT135–137 and 202 sequences identified by the iterative intra-donor phylogenetic analysis of donor 39 heavy-chain variable-domain sequences determined by 454 pyrosequencing. The iterative intra-donor phylogenetic analysis was based on an implementation of the neighbor-joining (NJ) method. Collapsed branches are indicated by Collapse {N: M}, in which N is the branch depth (number of intermediate nodes) and M is the number of sequences within the branch. All sequences are on the PGT137 branch except for 174091, which is somatically related to PGT136.
Figure 4
Sequence selection for functional characterization of heavy chains from donor 39. (A) Divergence/identity analysis of 15 heavy-chain variable-domain sequences obtained from the clustering analysis of 202 sequences identified by intra-donor phylogenetic analysis. Sequences of IGHV4-39 origin are plotted as a function of sequence identity to the PGT137 heavy chain and sequence divergence from the inferred germline allele, with 15 selected sequences shown as black triangles and their amino-acid consensus as a red triangle. (B) Percent population of 15 clusters obtained using a sequence identity cutoff of 95%. Each cluster is indicated by its representative sequence. “Frequency” refers to the total number of sequences observed for each cluster. (C) Protein sequences of 15 cluster representatives and their amino-acid consensus. Sequences are aligned to the inferred germline gene, IGHV4-39*07. Framework regions (FR) and complementarity-determining regions (CDRs) are based on Kabat nomenclature. Amino acids mutated from the germline gene are shown in red.
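Clustering at a 95% identity cutoff, as in panel (B), can be sketched as follows. The clustering algorithm is not specified here, so this example swaps in a simple greedy scheme in which the first sequence seen in each cluster becomes its representative; it should be read as an illustration, not as the authors' method. It assumes sequences are pre-aligned to equal length, and all names are hypothetical.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences (assumption)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: Dict[str, str], cutoff: float = 95.0) -> Dict[str, List[str]]:
    """Assign each sequence to the first representative it matches at >= cutoff.

    A sequence that matches no existing representative starts a new cluster
    and becomes that cluster's representative.
    """
    clusters: Dict[str, List[str]] = {}
    for name, seq in seqs.items():
        for rep in clusters:
            if identity(seq, seqs[rep]) >= cutoff:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


def percent_population(clusters: Dict[str, List[str]]) -> Dict[str, float]:
    """Share of all clustered sequences held by each cluster, in percent."""
    total = sum(len(members) for members in clusters.values())
    return {rep: 100.0 * len(members) / total for rep, members in clusters.items()}
```

Greedy clustering depends on input order, so sorting the input (for example by read abundance) before clustering changes which sequence represents each cluster.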
Figure 5
Divergence/identity analysis of heavy-chain neutralization. (A) The expressed heavy-chain sequences color-coded based on the neutralization potency of reconstituted antibodies, with IC50 <1.0 for both viruses shown in red (effective neutralizers), IC50 >50.0 for both viruses in black (non-neutralizers), and other cases in gray (weak neutralizers). The amino-acid consensus, when reconstituted with PGT137 light chain, neutralized both viruses with an IC50 <1.0 and is shown as a red hollow star. (B) The three largest clusters are displayed on the enlarged divergence/identity plot, with 136, 46, and 7 members, respectively.
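The three-way color coding in panel (A) amounts to a simple threshold rule, sketched below in Python. The thresholds 1.0 and 50.0 come from the caption; the IC50 units are not stated in this section, so the sketch treats the values as plain numbers in whatever unit the assay reports. The function name and category labels are hypothetical.

```python
from typing import Sequence


def classify_neutralization(
    ic50_values: Sequence[float],
    potent_below: float = 1.0,
    inactive_above: float = 50.0,
) -> str:
    """Classify a reconstituted antibody from its IC50 against each tested virus.

    "effective"       : IC50 below potent_below for every virus (red in the figure)
    "non-neutralizer" : IC50 above inactive_above for every virus (black)
    "weak"            : anything else (gray)
    """
    if not ic50_values:
        raise ValueError("at least one IC50 value is required")
    if all(v < potent_below for v in ic50_values):
        return "effective"
    if all(v > inactive_above for v in ic50_values):
        return "non-neutralizer"
    return "weak"
```

For example, an antibody with IC50 values of 0.2 and 0.8 against the two viruses falls in the "effective" class, while one with 0.5 and 60.0 falls in the "weak" class because it meets neither condition for both viruses.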
Figure 6
Repertoire of donor 39 light-chain variable-domain sequences of IGKV3-15 origin determined by 454 pyrosequencing. After processing by a standard bioinformatics pipeline, 87,229 full-length, light-chain variable-domain sequences from the IGKV3-15 germline family are plotted as a function of sequence identity to the light-chain variable domain of PGT135 (A), PGT136 (B), and PGT137 (C) and of sequence divergence from the inferred IGKV3-15 germline allele. Color coding indicates the number of sequences.
Figure 7
Evolutionary similarity of PGT135–137 to donor 39 light-chain variable-domain sequences. Germline-rooted maximum-likelihood tree of PGT135–137 and 72 sequences identified by the iterative intra-donor phylogenetic analysis of donor 39 light-chain variable-domain sequences determined by 454 pyrosequencing. The iterative intra-donor phylogenetic analysis was based on an implementation of the neighbor-joining (NJ) method. Collapsed branches are indicated by Collapse {N: M}, as in Figure 3. Sequences that are immediately outside the maximum-likelihood-defined PGT135–137 subtree are circled with a blue dashed line.
Figure 8
Sequence selection for functional characterization of light chains from donor 39. (A) Divergence/identity analysis of 10 light-chain variable-domain sequences obtained from the clustering analysis of 72 sequences identified by intra-donor phylogenetic analysis. Sequences of IGKV3-15*01 origin are plotted as a function of sequence identity to PGT137 light chain and sequence divergence from inferred germline allele, with 10 selected sequences shown as black triangles. (B) Percent population of 10 clusters obtained using a sequence identity cutoff of 95%. Each cluster is indicated by its representative sequence. (C) Protein sequences of 10 cluster representatives. Sequences are aligned to the inferred germline gene, IGKV3-15*01. Framework regions (FR) and complementarity-determining regions (CDRs) are based on Kabat nomenclature. Amino acids mutated from the germline gene are shown in red.
Figure 9
Divergence/identity analysis of light-chain neutralization. (A) The expressed light-chain sequences color-coded based on the neutralization potency of reconstituted antibodies, with IC50 <1.0 for both viruses shown in red (effective neutralizers), IC50 >50.0 for both viruses in black (non-neutralizers), and other cases in gray (weak neutralizers). (B) The three largest clusters are displayed on the enlarged divergence/identity plot, with 45, 6, and 4 members, respectively.
