MAbs. 2022 Jan-Dec;14(1):2031482.
doi: 10.1080/19420862.2022.2031482.

In silico proof of principle of machine learning-based antibody design at unconstrained scale

Rahmad Akbar et al. MAbs. 2022 Jan-Dec.

Abstract

Generative machine learning (ML) has been postulated to become a major driver in the computational design of antigen-specific monoclonal antibodies (mAb). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody-binding parameters. The simulation framework enables the computation of synthetic antibody-antigen 3D structures, and it functions as an oracle for unrestricted prospective evaluation and benchmarking of antibody design parameters of ML-generated antibody sequences. We found that a deep generative model, trained exclusively on antibody sequence (one-dimensional: 1D) data, can be used to design conformational (three-dimensional: 3D) epitope-specific antibodies, matching or exceeding the training dataset in affinity and in the variety of developability parameter values. Furthermore, we established a lower threshold of sequence diversity necessary for high-accuracy generative antibody ML and demonstrated that this lower threshold also holds on experimental real-world data. Finally, we show that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Our work establishes a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design.

Keywords: Generative machine learning; antibody design; epitope; paratope.


Conflict of interest statement

E.M. declares holding shares in aiNET GmbH. V.G. declares advisory board positions in aiNET GmbH and Enpicom B.V. V.G. is a consultant for Adaptyv Biosystems, Specifica Inc, and Roche/Genentech.

Figures

Figure 1.
In silico proof of principle of ML-based antibody design at unconstrained scale. We leveraged large synthetic ground-truth antibody sequence data with known paratope, epitope, and affinity to demonstrate, as a proof of principle, (a,b) the unconstrained deep-generative-learning-based generation of native-like antibody sequences. (c) An in silico oracle (Absolut!) enables the prospective evaluation of conformational (3D) affinity, paratope-epitope pairs, and developability of in silico generated antibody sequences. We also leveraged an experimentally validated oracle to test antibody design conclusions drawn from the synthetic antibody sequence data. (d) Finally, we show that transfer learning increases the generation quality of ML models trained on low-N data.
Figure 2.
Computational workflow for ML-based antibody design and evaluation thereof. (a) Generation of in silico training datasets with binding paratope, epitope, and affinity annotation. Briefly, PDB (Protein Data Bank) 3D antigen structures were obtained from the Antibody Database and native antibody sequences (CDR-H3) were obtained from Greiff and colleagues. CDR-H3 sequences were annotated with their corresponding affinity and epitope to each antigen using the Absolut! software suite. In addition, six widely used developability parameters were calculated for each CDR-H3 sequence (see Table 2). (b) Training a generative model on high-affinity CDR-H3 sequences to each antigen. Native linear 1D antigen-specific CDR-H3 sequences were used as input to train sequence-based RNN-LSTM generative models. Of note, the RNN-LSTM architecture did not receive any explicit 3D information on the paratope, epitope, affinity, or developability of a given sequence. (c) Large-scale in silico CDR-H3 sequence generation and binding validation. Following training, the deep models were used to generate new CDR-H3 sequences, which were then evaluated (prospectively tested) for their antigen specificity (affinity, paratope, epitope) using Absolut! (simulation) and annotated with developability-associated parameters. (d) Comparison of training and generated affinities. The affinity of the training antigen-specific CDR-H3 sequences (nseq = 70,000) to 10 different 3D antigens obtained from the PDB (see Table 1) is shown in blue. The affinity of the 70,000 generated CDR-H3 sequences from the 10 RNN-LSTM models is shown in yellow. (e) Comparison of training and generated sequences for paratope-epitope recognition. Absolut! was used to compute the affinity and paratope fold/epitope of the training data (see Methods: Generation of lattice-based antibody-antigen binding structures using Absolut!).
For readability, paratope and epitope statistics in the training (native) and generated datasets are visualized at larger proportions for the antigen 1OB1. (f) Pearson correlation (range: 0.864–0.907) of CDR-H3 sequence composition between training ("native") and generated datasets, quantifying the preservation of long-range dependencies. CDR-H3 sequence composition was measured using gapped k-mers, where the size of the k-mer was 1 and the size of the maximum gap varied between 1 and 5. (g) CDR-H3 sequence similarity (Levenshtein distance, LD) distribution determined among training (native) and generated CDR-H3 sequence datasets (see Supplementary Fig. S4 for the LD distribution of CDR-H3 sequences within the native and generated set, respectively). (h) CDR-H3 sequence novelty (overlap, defined as (CDR-H3antigen_x ∩ CDR-H3antigen_y)/70,000, where x and y are the 10 antigens listed in Table 1) of CDR-H3 sequences (median overlap <0.5%, i.e., novelty >99.5%) between both "native and generated" and "generated and generated" datasets across all antigen combinations. (i) Developability parameter distributions of training and generated CDR-H3 sequences overlap substantially (see Table 2 for a description of the developability parameters used).
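The sequence-comparison metrics in panels (f) and (g) are standard and can be sketched in a few lines. The functions below are a minimal illustration (function names and toy sequences are ours, not from the authors' pipeline), assuming gapped k-mers with k = 1 on each side and the classic dynamic-programming Levenshtein distance:

```python
from collections import Counter

def gapped_pairs(seq, gap):
    """Count ordered amino-acid pairs (k = 1 on each side) separated by
    exactly `gap` intervening positions, i.e., a gapped k-mer composition."""
    return Counter((seq[i], seq[i + gap + 1]) for i in range(len(seq) - gap - 1))

def levenshtein(a, b):
    """Levenshtein distance (LD) via the standard dynamic program:
    minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,           # deletion
                           cur[j - 1] + 1,        # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]
```

Correlating the `gapped_pairs` count vectors of a native and a generated dataset (e.g., via Pearson correlation) would reproduce the kind of composition statistic shown in panel (f).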
Figure 3.
Exhaustive generation reveals better antibodies than are present in the training dataset. (a) To examine the ability of the RNN-LSTM model to generate CDR-H3 sequences beyond the native realm (in terms of quantity and affinity), we first binned the native high-affinity antigen-specific training CDR-H3 sequences into four affinity classes: hyperbinder (affinity > max native), ultimate binder (max native–1/3), penultimate binder (1/3–2/3), and binder (2/3–min native). Following binning, we used deep generative models to generate 700 K new sequences, devised 10 cutoffs in increments of 70 K (70 K [native-sized], 140 K, …, 700 K [large]), subsampled 10 times (from the 700 K generated sequences), and counted the number of novel sequences at each cutoff. Native (training) and generated sequences are shown in blue and yellow, respectively; error bars are shown for subsampled sequences. We found that, for all affinity classes, the number of unique sequences in each class increases as a function of the total number of generated sequences. In addition, we found sequences that possess a higher affinity than the native training sequences (called hyperbinders), with affinity improvements over native CDR-H3 sequences ranging between 0.04% and 4.4% [depending on the antigen; percentages were calculated relative to the minimum affinity per antigen]. (b) To examine the diversity and preferences of developability combinations, we annotated each CDR-H3 sequence with a binary developability encoding. Briefly, we binned each developability parameter into two bins (low = min–median and high = median–max) and annotated each sequence with a composite binary encoding from all six developability parameters (i.e., 0_0_0_0_0_1 indicates that the sequence has a low charge, low molecular weight, low gravy index, low instability index, low affinity to MHCII, and high affinity to MHC).
We found that the generated CDR-H3 sequences yielded larger ranges of developability combinations in both native-sized generation (nseq = 70,000) and large generation (nseq = 7×10^5). Error bars indicate the standard deviation across subsamples.
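The composite binary developability encoding described above can be sketched as follows; this is a minimal illustration with hypothetical parameter names and values (the paper's six actual parameters are listed in Table 2). Each parameter is split at its median across the dataset, and a sequence's per-parameter bins are joined into a string such as 0_0_0_0_0_1:

```python
import statistics

def binary_developability(values_per_param):
    """Given {parameter_name: [value per sequence]}, bin each parameter at
    its median (low = min-median -> '0', high = above median -> '1') and
    return one composite encoding string per sequence, e.g. '0_1_0'."""
    params = list(values_per_param)
    medians = {p: statistics.median(values_per_param[p]) for p in params}
    n_seqs = len(next(iter(values_per_param.values())))
    return [
        "_".join("1" if values_per_param[p][i] > medians[p] else "0"
                 for p in params)
        for i in range(n_seqs)
    ]
```

With six parameters this yields up to 2^6 = 64 possible composite classes, which is what makes the range of observed combinations a compact diversity measure.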
Figure 4.
Generation quality of antibody sequences depends on the size of the training dataset, and transfer learning enables the generation of higher-affinity CDR-H3 sequences from lower-sized training datasets. (a) To examine the impact of sample size on the resulting binding affinity and epitope (see Supplementary Fig. S8) of generated CDR-H3 sequences, we created smaller training datasets (nseq,subsample = 700; 7,000; 10,000; 20,000; 30,000; 40,000; 50,000; 60,000, and nreplicates = 5) from the full antigen-specific CDR-H3 sequences (nseq,training = 70,000), trained deep generative models on the subsets, and compared the binding affinity and epitope against those from models trained on the full data and against the native affinity and epitope (see Supplementary Fig. S8 for correlations of CDR-H3 epitope occupancy). We found that models trained on the larger dataset sizes (>2×10^4), but not the smaller subsets (on the order of 10^3 or 10^2), sufficiently replicate the distribution of binding affinity and epitope of CDR-H3 sequences. (b) To investigate whether transfer learning may be used to improve the affinity and epitope binding (see also Supplementary Fig. S9–Supplementary Fig. S13) of CDR-H3 sequences generated by models trained on smaller-sized datasets, we constructed a transfer architecture wherein the embedding and RNN-LSTM layers from a "data-rich" model (high N, nseq,training = 70,000) were stacked atop a fresh dense layer, and the resulting "transfer" model was trained on lower-sized datasets (data-poor; low N, nseq,training = 700/7,000). Two types of transfer experiments were performed: a within-antigen transfer experiment (e.g., between a data-rich model of an antigen V and data-poor models of the same antigen V) and a between-antigens (across-antigens) transfer experiment (e.g., between a data-rich model of an antigen V and a data-poor model of an antigen G).
We used the Kolmogorov–Smirnov distance (KSD; 0 for identical distributions, with increasing values for increasingly dissimilar distributions) to quantify the similarity between affinity distributions of CDR-H3 sequences generated by models with transfer learning (+T) and without transfer learning (-T). For within-antigen transfer experiments, we found marked reductions of KSD values (against the native population) for all antigens, signifying the transferability of general antibody-antigen binding features within antigens. For across-antigens transfer experiments, 7 out of 10 antigens showed reductions in KSD values in at least one transfer scenario (nseq,training = 700 or 7,000, Figure 4b), suggesting the transferability of antibody-antigen binding features across antigens.
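The KSD used here is the standard two-sample Kolmogorov–Smirnov statistic (in practice one would typically call scipy.stats.ks_2samp); a minimal standard-library sketch of the statistic itself:

```python
from bisect import bisect_right

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov distance: the maximum absolute
    difference between the two empirical CDFs. Returns 0.0 for identical
    distributions; larger values indicate more dissimilar distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # empirical CDF of sample_a at x
        cdf_b = bisect_right(b, x) / len(b)  # empirical CDF of sample_b at x
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

In the figure's logic, a transfer model improves generation quality when ks_distance(affinities_with_transfer, affinities_native) is smaller than ks_distance(affinities_without_transfer, affinities_native).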
Figure 5.
RNN-LSTM models trained on experimentally validated binders (not synthetic sequences) generated native-experimental-like binders. (a) To validate that our RNN-LSTM model can reproduce properties not only of native-like synthetic binders but also of experimentally determined binders, we trained the model with varying numbers (700–Max; max for binders ~11 K and max for non-binders ~27 K) of binders and non-binders obtained from recently published experimental data against human epidermal growth factor receptor 2 (HER2), generated 7 × 10^4 sequences, and scored the sequences with the Mason et al. CNN classifier (the CNN classifier outputs a HER2 binding probability between 0 and 1). Subsequently, we used the CNN as an experimentally validated oracle to create datasets of binders (Pbind>0.7 or Pbind>0.5) and non-binders (Pbind≤0.3 or Pbind≤0.5) from our 7 × 10^6 mouse sequences, and trained our model on the oracled datasets (700–Max; max for binders and non-binders is 7 × 10^4). We subsampled five times for the lower-sized datasets (700 and 7,000). Finally, we compared the proportion of predicted HER2 binders and non-binders across models trained on experimental data and models trained on oracled data. (b) We found good correspondence between the experimental and oracled datasets in terms of the fraction of correctly predicted sequences (binders, non-binders). For binders, RNN-LSTM models trained on the smallest training datasets yielded the lowest fraction of correct predictions (Exp.: 0.25; Orac.: 0.16) and models trained on the largest training datasets yielded the highest fraction of correct predictions (Exp.: 0.68; Orac.: 0.71). For non-binders, we found that already at the smallest dataset sizes the models were able to yield non-binders, for both experimental and oracled datasets.
Specifically, the fractions of correct predictions for the smallest non-binder datasets were 0.95, 0.88, and 0.91 for the experimental and oracled (Pbind≤0.3 and Pbind≤0.5) categories, respectively; for the largest non-binder datasets, the fractions of correct predictions were 0.84, 0.92, and 0.94, respectively. Distributions of amino acids per position are summarized in Supplementary Fig. S15, and distributions of the predicted binding probability of the data shown here are in Supplementary Fig. S14. Baseline HER2 binding probability distributions of human and mouse CDR-H3 sequences are shown in Supplementary Fig. S16.
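The "fraction of correct predictions" reported above amounts to thresholding the CNN's output probability per generated sequence; a minimal sketch (function name and default thresholds chosen to mirror the cutoffs in the caption, not taken from the authors' code):

```python
def fraction_correct(pbind, label, threshold_bind=0.7, threshold_nonbind=0.3):
    """Fraction of generated sequences whose predicted HER2 binding
    probability falls on the expected side of the cutoff.
    label: 'binder'     -> correct if Pbind >  threshold_bind
           'non-binder' -> correct if Pbind <= threshold_nonbind"""
    if label == "binder":
        hits = sum(p > threshold_bind for p in pbind)
    else:
        hits = sum(p <= threshold_nonbind for p in pbind)
    return hits / len(pbind)
```

For example, scoring a batch of sequences generated by a binder-trained model with the CNN and passing the probabilities to fraction_correct(..., "binder") would yield numbers comparable to the 0.25–0.71 range reported in panel (b).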

