Bioinformatics. 2021 Nov 18;37(22):4041-4047.

doi: 10.1093/bioinformatics/btab434.

Humanization of antibodies using a machine learning approach on large-scale repertoire data

Claire Marks¹, Alissa M Hummer¹, Mark Chin¹, Charlotte M Deane¹

PMID: 34110413
PMCID: PMC8760955
DOI: 10.1093/bioinformatics/btab434

Claire Marks et al. Bioinformatics. 2021.

Affiliation

¹ Department of Statistics, University of Oxford, Oxford OX1 3LB, UK.

Abstract

Motivation: Monoclonal antibody (mAb) therapeutics are often produced from non-human sources (typically murine), and can therefore generate immunogenic responses in humans. Humanization procedures aim to produce antibody therapeutics that do not elicit an immune response and are safe for human use, without impacting efficacy. Humanization is normally carried out in a largely trial-and-error experimental process. We have built machine learning classifiers that can discriminate between human and non-human antibody variable domain sequences using the large amount of repertoire data now available.

Results: Our classifiers consistently outperform the current best-in-class model for distinguishing human from murine sequences, and our output scores exhibit a negative relationship with the experimental immunogenicity of existing antibody therapeutics. We used our classifiers to develop a novel, computational humanization tool, Hu-mAb, that suggests mutations to an input sequence to reduce its immunogenicity. For a set of therapeutic antibodies with known precursor sequences, the mutations suggested by Hu-mAb show substantial overlap with those deduced experimentally. Hu-mAb is therefore an effective replacement for trial-and-error humanization experiments, producing similar results in a fraction of the time.

Availability and implementation: Hu-mAb (humanness scoring and humanization) is freely available to use at opig.stats.ox.ac.uk/webapps/humab.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

**Fig. 1.**
Percentage of antibody therapeutics classified as human by our random forest (RF) models, split by their origin: Human (176 sequences), Humanized (214 sequences), Chi/Humanized (34 sequences), Chimeric (43 sequences) and Mouse (14 sequences). Chi/Humanized sequences are part humanized and part chimeric. Therapeutics were classified based on their VH and VL sequences separately, as well as combined (to be classified as human, both the VH and VL scores had to be above the respective Youden's J statistic (YJS) threshold). As the humanness of the therapeutics decreases (left to right), the proportion classified as human also decreases.
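
The thresholding described in this caption can be illustrated with a minimal sketch. It is not the authors' code: it only uses the standard definition of Youden's J statistic (J = sensitivity + specificity - 1) to pick a cut-off, then applies the "both chains must pass" rule. The function names, the toy scores, and the choice to count a score exactly at the threshold as human are all assumptions made for illustration.

```python
def youden_threshold(human_scores, nonhuman_scores):
    """Return the cut-off that maximises J = sensitivity + specificity - 1.

    Candidate cut-offs are the observed scores; a score >= cut-off is
    called human (treating ties as human is an assumption of this sketch).
    """
    best_thr, best_j = None, -1.0
    for thr in sorted(set(human_scores) | set(nonhuman_scores)):
        sensitivity = sum(s >= thr for s in human_scores) / len(human_scores)
        specificity = sum(s < thr for s in nonhuman_scores) / len(nonhuman_scores)
        j = sensitivity + specificity - 1
        if j > best_j:  # keep the lowest cut-off when several tie on J
            best_thr, best_j = thr, j
    return best_thr


def classified_human(vh_score, vl_score, vh_thr, vl_thr):
    """Combined call: both VH and VL must reach their own threshold."""
    return vh_score >= vh_thr and vl_score >= vl_thr


# Toy scores (hypothetical, not taken from the paper).
vh_thr = youden_threshold([0.9, 0.8, 0.95, 0.6], [0.1, 0.3, 0.7, 0.2])
print(vh_thr, classified_human(0.9, 0.7, vh_thr, 0.6))
```

In practice each chain type would have its own threshold derived from its own model's validation scores, which is why the combined rule takes two separate cut-offs.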

**Fig. 2.**
Relationship between the humanness scores produced by our RF models and experimentally determined immunogenicity. Therapeutics were split into three categories according to the minimum humanness score of the VH and VL chains: score above 0.9 [‘Positive (high score, score > 0.9)’] (85 sequences), score above the YJS threshold for the relevant RF model but ≤ 0.9 [‘Positive (score ≤ 0.9)’] (57 sequences) and score below the YJS threshold (‘Negative’) (75 sequences). Both the VH and VL sequences have to be above the threshold to be classed as ‘Positive’. The immunogenicity of a therapeutic is also represented by three levels: over 50% of patients develop anti-drug antibodies (ADAs) (orange, solid), 10–50% of patients develop ADAs (yellow, dotted) and under 10% of patients develop ADAs (blue, striped). Therapeutic sequences classified as human by our model tend to have low immunogenicity levels, while sequences classified as not human are more immunogenic.
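
The two three-way groupings in this caption can be written down as a short sketch. The function names and the handling of values that fall exactly on a boundary (a score equal to the threshold, or exactly 10% or 50% ADA incidence) are assumptions; the caption does not specify them.

```python
def humanness_category(vh_score, vl_score, vh_thr, vl_thr, high=0.9):
    """Assign one of the three humanness categories used in the figure.

    vh_thr and vl_thr stand for the per-model YJS thresholds
    (hypothetical values are passed in by the caller).
    """
    if vh_score < vh_thr or vl_score < vl_thr:
        return "Negative"  # at least one chain fails its threshold
    if min(vh_score, vl_score) > high:
        return "Positive (high score, score > 0.9)"
    return "Positive (score <= 0.9)"


def immunogenicity_level(ada_percent):
    """Bin the percentage of patients who develop anti-drug antibodies."""
    if ada_percent > 50:
        return "over 50%"
    if ada_percent >= 10:  # boundary handling is an assumption
        return "10-50%"
    return "under 10%"


print(humanness_category(0.95, 0.80, 0.6, 0.7), immunogenicity_level(3))
```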

**Fig. 3.**
The Hu-mAb humanization procedure demonstrated using the heavy chain sequence of the therapeutic Campath. The humanized sequence produced experimentally is shown at the bottom of the figure (conserved residues in yellow, mutated residues in orange). Starting with the unhumanized precursor sequence (top), Hu-mAb makes every possible mutation to the framework residues (grey) and selects the one that produces the largest increase in humanness score. CDR residues (dark blue) are not mutated, to preserve binding. This procedure is performed iteratively until the humanness score reaches a given threshold. Mutations suggested by Hu-mAb are coloured depending on whether they are the same as (green), similar to (blue) or different from (red) the mutations made experimentally. In this case, Hu-mAb suggested 16 mutations (compared to 39 from the experiment), of which 14 were the same as or similar to those derived experimentally.
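
The iterative procedure described in this caption is a greedy search, which the following sketch illustrates. It is not the Hu-mAb implementation: `greedy_humanize`, `toy_score` and the 0-based position indices are hypothetical, and the toy scoring function (fraction of positions matching a made-up reference string) merely stands in for a trained classifier's humanness score.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def greedy_humanize(sequence, framework_positions, score_fn, target, max_steps=100):
    """Repeatedly apply the single framework mutation that raises the score most.

    Positions outside framework_positions (e.g. CDRs) are never changed.
    Stops when the score reaches target, no mutation helps, or max_steps is hit.
    """
    seq = list(sequence)
    mutations = []  # (position, old residue, new residue)
    current = score_fn("".join(seq))
    while current < target and len(mutations) < max_steps:
        best = None  # (score, position, residue)
        for pos in framework_positions:
            original = seq[pos]
            for aa in AMINO_ACIDS:
                if aa == original:
                    continue
                seq[pos] = aa  # try the mutation
                score = score_fn("".join(seq))
                if best is None or score > best[0]:
                    best = (score, pos, aa)
            seq[pos] = original  # undo before testing the next position
        if best is None or best[0] <= current:
            break  # no single mutation improves the score
        current, pos, aa = best
        mutations.append((pos, seq[pos], aa))
        seq[pos] = aa
    return "".join(seq), mutations


def toy_score(seq, reference="QVQLVQSG"):
    """Stand-in humanness score: fraction of positions matching a reference."""
    return sum(a == b for a, b in zip(seq, reference)) / len(reference)


# Positions 3, 4, 6 and 7 play the role of CDR residues and stay fixed.
humanized, changes = greedy_humanize("EVKLVESG", [0, 1, 2, 5], toy_score, target=1.0)
print(humanized, changes)
```

Each iteration scores every single-residue substitution at every framework position, so the cost per step grows with the number of framework positions times 19 alternative residues; that is cheap for a fast classifier but is the main reason to cap the number of steps.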

**Fig. 4.**
Feature importance of the VH V3 RF model and its top 10 features. The x-axis shows the residue positions in sequential order (left to right, IMGT numbering scheme). The inset table shows the top 10 features and the percentage frequency of the relevant amino acid type seen within the respective sets of sequences (V3 and negative, or non-human). The most important features likely determine the humanness of the sequence and are mainly located in the framework regions.
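
The "percentage frequency" column of the inset table can be read as follows. This is a minimal sketch with made-up, pre-aligned toy sequences and a hypothetical function name; real input would be sequences aligned to the IMGT numbering scheme.

```python
def residue_frequency(sequences, position, residue):
    """Percentage of aligned sequences that carry `residue` at `position`."""
    if not sequences:
        raise ValueError("need at least one sequence")
    hits = sum(1 for seq in sequences if seq[position] == residue)
    return 100.0 * hits / len(sequences)


# Hypothetical aligned fragments for a positive (V3) and a negative set.
v3_like = ["QVQL", "EVQL", "QVKL", "QVQL"]
negative = ["EVKL", "EVQL", "QVKL", "EVKL"]
print(residue_frequency(v3_like, 0, "Q"), residue_frequency(negative, 0, "Q"))
```

A feature whose residue is common in one set and rare in the other is the kind of position a classifier can use to separate the two sets.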

Grants and funding

Medical Research Council (MRC), United Kingdom
