Review
MAbs. 2025 Dec;17(1):2528902.
doi: 10.1080/19420862.2025.2528902. Epub 2025 Jul 18.

Artificial intelligence-driven computational methods for antibody design and optimization

Luiz Felipe Vecchietti et al. MAbs. 2025 Dec.

Abstract

Antibodies play a crucial role in our immune system. Their ability to bind to and neutralize pathogens opens opportunities to develop antibodies for therapeutic and diagnostic use. Computational methods capable of designing antibodies for a target antigen can revolutionize drug discovery, reducing the time and cost required for drug development. Artificial intelligence (AI) methods have recently achieved remarkable advancements in the design of protein sequences and structures, including the ability to generate scaffolds for a given motif and binders for a specific target. These generative methods have been applied to antigen-conditioned antibody design, with experimental binding confirmed for de novo-designed antibodies. This review surveys current AI methods used in antibody development, focusing on those for antigen-conditioned antibody design. The results obtained by AI-based methodologies in antibody and protein research suggest a promising direction for generating de novo binders for various target antigens.

Keywords: Antibody design; generative artificial intelligence; machine learning; protein design; structural biology.

Conflict of interest statement

No potential conflict of interest was reported by the author(s).

Figures

Diagram depicting the structure of an antibody and the interaction between antibody and antigen, where the interaction is mainly between the complementarity-determining regions of the antibody and the epitope of the antigen.
Figure 1.
Antibody structure. Antibodies, or immunoglobulins, are Y-shaped glycoproteins composed of two identical heavy chains and two identical light chains. The variable regions of the heavy and light chains (VH and VL) in the antigen-binding domain (Fab) contain the hypervariable antigen-recognition regions known as complementarity-determining regions (CDRs), with three CDRHs in VH and three CDRLs in VL. Antigen-antibody interactions occur between the epitope of the antigen and the paratope of the antibody. The domain that mediates biological activity (Fc) is made up of the CH2 and CH3 regions of the two heavy chains, providing a binding site for endogenous Fc receptors on lymphocytes to facilitate immune responses. Additionally, dyes and enzymes can be covalently linked to Fc for experimental visualization.
Diagram depicting different subproblems in computational antibody design. Five diagrams are shown for antibody structure prediction, antibody representation learning, antibody sequence design, unconditioned antibody design, and antigen-conditioned antibody design, respectively.
Figure 2.
Subproblems in computational antibody design pipelines include (a) antibody structure prediction, (b) antibody representation learning, (c) antibody sequence design, (d) unconditioned antibody design, and (e) antigen-conditioned antibody design. Full antibody design is illustrated in (d) and (e), but partial antibody design methods have also been developed. Here, seq-str stands for sequence and/or structure.
Three diagrams showing the classification of antibody design methods presented in this review based on their generated output type. The three classes are sequence-based antibody design, structure-based antibody design, and sequence-structure antibody co-design.
Figure 3.
Classification of antibody design methods based on the generated output type. In sequence-based design (a), only the antibody amino acid sequence is generated. In structure-based design (b), the model generates only the antibody structure, which can be followed by sequence design. In sequence-structure co-design (c), the method jointly generates the antibody sequence and structure (seq-str). Partial antibody design is illustrated here, but this classification also applies to full antibody design. The antigen enclosed in square brackets is provided as an explicit model input only in antigen-conditioned design. When applicable, the antigen structure input can be a free entity or docked to the antibody, depending on the method. Generally, unconditioned design models are trained on antibody datasets and output antibody candidates. In contrast, antigen-conditioned design models are trained on antigen-antibody complex datasets and output antigen-antibody complex candidates.
Two diagrams describing techniques commonly used for antigen-conditioned antibody sequence-and-structure co-design are shown. The first describes techniques based on graph neural networks, while the second describes techniques based on diffusion models.
Figure 4.
Common approaches for antigen-conditioned antibody sequence-and-structure co-design. (a) A GNN-based method following the methodology in MEAN, where the antigen-antibody complex is represented by a graph with three subgraphs, one per chain: antibody heavy chain VH, antibody light chain VL, and antigen. Each residue in a chain is represented by a node whose embedding contains its position and properties (e.g., amino acid type), while each inter-residue interaction is represented by an edge. Learning is performed by internal and external context encoders via message passing within and between the subgraphs, respectively, to update the node embeddings. After the final layer, the protein sequence of the chains is predicted from the node embeddings. (b) A diffusion-based method. Here, we assume that only the interaction regions of the antibody are designed, while the other regions are kept fixed. During the forward diffusion process, noise is gradually added to a representation of the antibody sequence/structure (e.g., in DiffAb, joint diffusion of residue types, Cα atom coordinates, and residue orientations) from timestep 0 to timestep T following a Markov process defined by q(x_t | x_{t−1}), where x_t is the representation at timestep t. During the backward diffusion process, a model with parameters θ is trained to recover the original representation by gradually denoising with p_θ(x_{t−1} | x_t) from timestep T to timestep 0. Note that separate neural networks can be used for the backward diffusion process of each representation in the joint diffusion.
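
The forward process in panel (b) can be illustrated with a minimal sketch. It is not the DiffAb implementation: it assumes a standard Gaussian forward step on Cα coordinates only, with a linear noise schedule whose values are chosen purely for illustration. The function names, the schedule parameters, and the toy coordinates are all hypothetical, and residue types and orientations, which DiffAb also diffuses, are left out.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule (illustrative values, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_step(coords, beta, rng):
    """One Markov step, assuming q(x_t | x_{t-1}) = N(sqrt(1 - beta) * x_{t-1}, beta * I)."""
    scale = math.sqrt(1.0 - beta)
    sigma = math.sqrt(beta)
    return [
        tuple(scale * c + sigma * rng.gauss(0.0, 1.0) for c in atom)
        for atom in coords
    ]


def forward_diffusion(coords, betas, rng):
    """Apply the forward steps in order and return the trajectory x_0, ..., x_T."""
    trajectory = [list(coords)]
    for beta in betas:
        trajectory.append(forward_step(trajectory[-1], beta, rng))
    return trajectory


# Toy input: three C-alpha positions along a line (made-up coordinates).
rng = random.Random(0)
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
betas = linear_beta_schedule(num_steps=100)
trajectory = forward_diffusion(ca_coords, betas, rng)
```

Here `trajectory[0]` is the unperturbed input and `trajectory[-1]` is the noised representation at timestep T. The backward process would train a network to approximate p_θ(x_{t−1} | x_t) and run these steps in reverse; that part is not shown.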
