J Am Chem Soc. 2025 Jan 22;147(3):2747-2755.
doi: 10.1021/jacs.4c15551. Epub 2025 Jan 10.

Biochemical and Computational Characterization of Haloalkane Dehalogenase Variants Designed by Generative AI: Accelerating the SN2 Step

Natalia Gelfand et al. J Am Chem Soc. 2025.

Abstract

Generative artificial intelligence (AI) models trained on natural protein sequences have been used to design functional enzymes. However, their ability to predict individual reaction steps in enzyme catalysis remains unclear, which limits the use of sequence information for enzyme engineering. In this study, we demonstrated that a generative maximum-entropy (MaxEnt) model built from sequence information can predict the rate of the SN2 step of a haloalkane dehalogenase. We then used the model to design lower-order protein variants of haloalkane dehalogenase. Kinetic measurements confirmed that the designed variants enhance catalytic activity above that of the wild type, both in the overall reaction and, in particular, in the SN2 step. On the simulation side, we provided molecular insights into these designs for the SN2 step using empirical valence bond (EVB) and metadynamics simulations. The EVB calculations yielded activation barriers consistent with the experimental reaction rates and were used to examine how amino acid replacements alter the electrostatic contribution to the activation barrier, the consequences of water penetration, and the extent of ground-state destabilization or stabilization. The metadynamics simulations emphasized the importance of substrate positioning in enzyme catalysis. Overall, our AI-guided approach enabled the design of a variant with a faster SN2 step than the wild-type enzyme, even though haloalkane dehalogenase has been extensively optimized through natural evolution.
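
As an illustration of how a MaxEnt model of this kind scores a sequence, the sketch below evaluates a statistical energy with per-site terms and pairwise couplings, the two ingredients named in the Figure 2 caption. It assumes the common Potts-type form E(S) = -Σ_i h_i(s_i) - Σ_{i<j} J_ij(s_i, s_j) with P(S) proportional to exp(-E(S)); the sign convention, the function name `statistical_energy`, the two-site toy alphabet, and all parameter values are hypothetical and are not taken from the paper.

```python
import itertools


def statistical_energy(seq, fields, couplings):
    """Potts-type energy: E(S) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j).

    fields[i] maps a residue letter to h_i; couplings[(i, j)] maps a pair of
    residue letters to J_ij. Missing entries count as zero.
    """
    energy = 0.0
    # Single-site (conservation) terms.
    for i, residue in enumerate(seq):
        energy -= fields[i].get(residue, 0.0)
    # Pairwise (coupling) terms over all site pairs i < j.
    for i, j in itertools.combinations(range(len(seq)), 2):
        energy -= couplings.get((i, j), {}).get((seq[i], seq[j]), 0.0)
    return energy


# Invented toy parameters for a two-site "protein" (illustration only).
fields = [{"A": 1.0, "G": 0.5}, {"C": 1.0, "D": 0.8}]
couplings = {(0, 1): {("A", "C"): 0.2, ("G", "D"): 0.6}}

reference = statistical_energy("AC", fields, couplings)
variant = statistical_energy("GD", fields, couplings)
delta_e = variant - reference  # negative would mean the variant is more probable
```

Under this convention, a candidate variant would be ranked by `delta_e` relative to the reference sequence, with lower energy read as higher predicted activity, in line with the Figure 2 caption.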

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1.
DhlA-catalyzed reaction and the structure of DhlA. (A) Reaction Scheme. SN2 displacement of the chlorine atom from DCE, involving covalent bonding of DCE to the carboxylate group of D124, followed by hydrolysis, producing 2-chloroethanol. (B) Crystal structure of DhlA bound to DCE (PDB ID: 2DHC). (C) Close-up of the substrate (carbon atoms are in blue, chlorine atoms are in green) and selected active site residues, with F128 and L262 (underlined) highlighted as the mutated positions in the designs.
Figure 2.
Predicting the catalytic activity of DhlA using the generative MaxEnt model. (A) Schematic representation of the MaxEnt model. The model captures both site conservation and pairwise site couplings in the MSA constructed with DhlA homologues. (B) The MaxEnt model predicts both the overall catalytic activity (kcat) and the SN2 step (k2). Lower E(S) (or higher P(S)) indicates higher activity. The kcat, k2 and statistical energies E(S) are presented in Table S1.
Figure 3.
Normalized first derivative curves from differential scanning fluorimetry for DhlA protein variants. The peak maxima correspond to the melting temperatures (Tm), measured in 100 mM glycine buffer at pH 8.6.
Figure 4.
Model used for fitting transient kinetic data of DhlA protein variants.
Figure 5.
EVB simulations of DhlA protein variants. (A) Comparison of computed (pink) and experimental (blue) activation energies (ΔG) for the SN2 reaction. Experimental ΔG values are derived from the Eyring equation based on k2. (B−E) Representative molecular structures of native DhlA in its reactant state (B) and transition state (C), and of the L262V variant in its reactant state (D) and transition state (E). Only DCE and the side chains of key amino acids (D124, W125, F128, W175, V262) are shown. Atom color code: H in gray, C in slate blue, N in deep blue, O in red, Cl in green.
Figure 6.
Metadynamics simulations of DhlA protein variants. Angle θ(O−C−Cl) as a function of distance d(O−C) in (A) wild type, (B) L262V, (C) F128L/L262V, and (D) F128L/L262I.
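
The Figure 5 caption notes that experimental activation energies are derived from the Eyring equation based on k2. A minimal sketch of that conversion follows, assuming a transmission coefficient of 1 and a temperature of 298.15 K; the temperature, the function name `eyring_barrier`, and the example rate constants are assumptions for illustration, not values reported in the paper.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 1.98720426e-3     # gas constant, kcal/(mol*K)


def eyring_barrier(rate_per_s, temperature_k=298.15):
    """Activation free energy (kcal/mol) from a first-order rate constant.

    Rearranged Eyring equation with transmission coefficient 1:
        dG = -R*T * ln(k * h / (k_B * T))
    """
    if rate_per_s <= 0:
        raise ValueError("rate constant must be positive")
    return -R * temperature_k * math.log(rate_per_s * H / (K_B * temperature_k))


# Hypothetical rate constants, for illustration only.
barrier_slow = eyring_barrier(1.0)    # roughly 17.4 to 17.5 kcal/mol at 298.15 K
barrier_fast = eyring_barrier(10.0)   # a tenfold faster step
barrier_drop = barrier_slow - barrier_fast
```

Because the barrier depends on the logarithm of the rate, a tenfold increase in k2 corresponds to a barrier reduction of only RT ln 10, about 1.4 kcal/mol at room temperature, which gives a sense of the scale of the differences compared in Figure 5A.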
