Bioinformatics. 2021 Nov 18;37(22):4041-4047.
doi: 10.1093/bioinformatics/btab434.

Humanization of antibodies using a machine learning approach on large-scale repertoire data


Claire Marks et al. Bioinformatics.

Abstract

Motivation: Monoclonal antibody (mAb) therapeutics are often produced from non-human sources (typically murine), and can therefore generate immunogenic responses in humans. Humanization procedures aim to produce antibody therapeutics that do not elicit an immune response and are safe for human use, without impacting efficacy. Humanization is normally carried out in a largely trial-and-error experimental process. We have built machine learning classifiers that can discriminate between human and non-human antibody variable domain sequences using the large amount of repertoire data now available.

Results: Our classifiers consistently outperform the current best-in-class model for distinguishing human from murine sequences, and our output scores exhibit a negative relationship with the experimental immunogenicity of existing antibody therapeutics. We used our classifiers to develop a novel, computational humanization tool, Hu-mAb, that suggests mutations to an input sequence to reduce its immunogenicity. For a set of therapeutic antibodies with known precursor sequences, the mutations suggested by Hu-mAb show substantial overlap with those deduced experimentally. Hu-mAb is therefore an effective replacement for trial-and-error humanization experiments, producing similar results in a fraction of the time.

Availability and implementation: Hu-mAb (humanness scoring and humanization) is freely available to use at opig.stats.ox.ac.uk/webapps/humab.

Supplementary information: Supplementary data are available at Bioinformatics online.


Figures

Fig. 1. Percentage of antibody therapeutics classified as human by our RF models, split by their origin: Human (176 sequences), Humanized (214 sequences), Chi/Humanized (34 sequences), Chimeric (43 sequences) and Mouse (14 sequences). Chi/Humanized are sequences that are part humanized and part chimeric. Therapeutics were classified based on their VH and VL sequences separately, as well as combined (to be classified as human, both VH and VL scores had to be above the respective YJS threshold). As the humanness of the therapeutics decreases (left to right), the proportion classified as human also decreases.
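The combined classification rule described in this caption can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the function name `is_human_combined` and the threshold values used in the example are hypothetical, since the actual YJS thresholds depend on the RF model and are not given here.

```python
def is_human_combined(vh_score: float, vl_score: float,
                      vh_threshold: float, vl_threshold: float) -> bool:
    """Classify a therapeutic as human only if BOTH chains score
    above the threshold of their respective model."""
    return vh_score > vh_threshold and vl_score > vl_threshold


# Hypothetical scores and thresholds, for illustration only.
# The heavy chain passes but the light chain does not, so the
# combined call is "not human".
print(is_human_combined(0.95, 0.40, vh_threshold=0.70, vl_threshold=0.60))
```

Requiring both chains to pass makes the combined call at least as strict as either single-chain call.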
Fig. 2. Relationship between the humanness scores produced by our RF models and experimentally determined immunogenicity. Therapeutics were split into three categories according to the minimum humanness score of the VH and VL chains: positive with a score above 0.9 [‘Positive (high score, score > 0.9)’] (85 sequences), above the YJS threshold for the relevant RF model but with a score ≤ 0.9 [‘Positive (score ≤ 0.9)’] (57 sequences) and below the YJS threshold (‘Negative’) (75 sequences). Both the VH and VL sequences have to be above the threshold to be classed as ‘Positive’. The immunogenicity of a therapeutic is also represented by three levels: over 50% of patients develop ADAs (orange, solid), 10–50% of patients develop ADAs (yellow, dotted) and under 10% of patients develop ADAs (blue, striped). Therapeutic sequences classified as human by our model tend to have low immunogenicity levels, while sequences classified as not human are more immunogenic.
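The three-way split in this caption can be expressed as a small function. This is a sketch only: the name `humanness_category`, the short return labels, and the example thresholds are hypothetical, and treating a score exactly equal to a threshold as negative is an assumption, since the caption does not say how ties are handled.

```python
def humanness_category(vh_score: float, vl_score: float,
                       vh_threshold: float, vl_threshold: float,
                       high_cutoff: float = 0.9) -> str:
    """Assign one of three categories from the VH and VL humanness scores."""
    # Both chains must exceed their own model's threshold to be positive.
    if vh_score <= vh_threshold or vl_score <= vl_threshold:
        return "negative"
    # Among positives, the minimum of the two scores decides the tier.
    if min(vh_score, vl_score) > high_cutoff:
        return "positive_high"
    return "positive"


# Hypothetical thresholds, for illustration only.
print(humanness_category(0.97, 0.93, vh_threshold=0.70, vl_threshold=0.60))
print(humanness_category(0.97, 0.85, vh_threshold=0.70, vl_threshold=0.60))
print(humanness_category(0.97, 0.50, vh_threshold=0.70, vl_threshold=0.60))
```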
Fig. 3. The Hu-mAb humanization procedure demonstrated using the heavy chain sequence of the therapeutic Campath. The humanized sequence produced experimentally is shown at the bottom of the figure (conserved residues in yellow, mutated residues in orange). Starting with the unhumanized precursor sequence (top), Hu-mAb makes every possible mutation to the framework residues (grey) and selects the one that produces the largest increase in humanness score. CDR residues (dark blue) are not mutated to preserve binding. This procedure is performed iteratively until the humanness score reaches a given threshold. Mutations suggested by Hu-mAb are coloured depending on whether they are the same (green), similar (blue) or different (red) to mutations made experimentally. In this case, Hu-mAb suggested 16 mutations (compared to 39 from the experiment), of which 14 were the same or similar to those derived experimentally.
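The iterative procedure described in this caption is a greedy search, which can be sketched as follows. This is an illustrative sketch rather than the Hu-mAb implementation: `greedy_humanize`, `toy_score`, the reference string, the 0-based position indices and the `max_steps` cap are all hypothetical, and the toy scoring function stands in for the trained RF humanness score.

```python
from typing import Callable, List, Set, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def greedy_humanize(seq: str, cdr_positions: Set[int],
                    score: Callable[[str], float], target: float,
                    max_steps: int = 100) -> Tuple[str, List[Tuple[int, str, str]]]:
    """Repeatedly apply the single framework mutation that raises the
    score the most, until the target score is reached."""
    current, current_score = seq, score(seq)
    mutations: List[Tuple[int, str, str]] = []
    for _ in range(max_steps):
        if current_score >= target:
            break
        best = None  # (score, position, new residue)
        for pos, old in enumerate(current):
            if pos in cdr_positions:
                continue  # CDR residues are left untouched
            for aa in AMINO_ACIDS:
                if aa == old:
                    continue
                candidate = current[:pos] + aa + current[pos + 1:]
                s = score(candidate)
                if best is None or s > best[0]:
                    best = (s, pos, aa)
        if best is None or best[0] <= current_score:
            break  # no single mutation improves the score
        current_score, pos, aa = best
        mutations.append((pos, current[pos], aa))
        current = current[:pos] + aa + current[pos + 1:]
    return current, mutations


def toy_score(s: str, reference: str = "QVQLVQSG") -> float:
    """Stand-in score: fraction of positions matching a made-up reference."""
    return sum(a == b for a, b in zip(s, reference)) / len(reference)


humanized, muts = greedy_humanize("EVKLVESG", {4, 5}, toy_score, target=0.85)
print(humanized, muts)
```

In the toy run, positions 4 and 5 are marked as CDR, so the mismatch at position 5 is never mutated even though changing it would raise the score.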
Fig. 4. Feature importance of the VH V3 RF model and its top 10 features. The x-axis consists of the residue positions in a sequential manner (left to right, IMGT numbering scheme). The inset table shows the top 10 features and the percentage frequency of the relevant amino acid type seen within the respective sets of sequences (V3 and negative, or non-human). The most important features likely determine the humanness of the sequence and are mainly located in the framework regions.

