J Am Chem Soc. 2023 Aug 16;145(32):17656-17664.
doi: 10.1021/jacs.3c03639. Epub 2023 Aug 2.

Using Data Science for Mechanistic Insights and Selectivity Predictions in a Non-Natural Biocatalytic Reaction

Hanna D Clements et al. J Am Chem Soc. 2023.

Abstract

The study of non-natural biocatalytic transformations relies heavily on empirical methods, such as directed evolution, for identifying improved variants. Although exceptionally effective, this approach provides limited insight into the molecular mechanisms behind the transformations and necessitates multiple protein engineering campaigns for new reactants. To address this limitation, we disclose a strategy to explore the biocatalytic reaction space and garner insight into the molecular mechanisms driving enzymatic transformations. Specifically, we explored the selectivity of an "ene"-reductase, GluER-T36A, to create a data-driven toolset that explores reaction space and rationalizes the observed and predicted selectivities of substrate/mutant combinations. The resultant statistical models related structural features of the enzyme and substrate to selectivity and were used to effectively predict selectivity in reactions with out-of-sample substrates and mutants. Our approach provided a deeper understanding of enantioinduction by GluER-T36A and holds the potential to enhance the virtual screening of enzyme mutants.

Figures

Figure 1.
Workflow to develop statistical models of biocatalytic reaction performance. (A) Two complementary approaches, aMD and IFD, were used to generate enzyme conformers from a GluER-T36A crystal structure (PDB ID: 6MYW) after introducing the desired mutation in silico. (B) Enzyme features were quantified using a residue-based approach. Geometric descriptors include the length, width, and backbone angles of each residue conformer and the fluctuations of these measurements. Dynamic descriptors were measured by overlaying a residue conformational ensemble, encapsulating it in a fictitious surface, and measuring the resulting surface area and volume. Ligands were subjected to both a geometric analysis and DFT calculations to acquire electronic descriptors, including the natural bond orbital (NBO) charges of atoms indicated by a yellow sphere. (C) Descriptors for each enzyme/ligand were regressed against the experimentally determined selectivities, resulting in statistical models that enabled mechanistic insights and predicted the outcomes of out-of-sample substrates and mutants.
Figure 2.
aMD and IFD statistical models with ligand descriptors (red) and enzyme descriptors (blue). (A) The aMD model had training and validation R² values of 0.82 and 0.73, respectively, a leave-one-out (LOO) R² of 0.70, and a 4-fold R² of 0.67. (B) The IFD model had training and validation R² values of 0.83 and 0.57, respectively, a LOO R² of 0.73, and a 4-fold R² of 0.70. (C) Enantioselectivities of reactions with 5a and 6a to form 5b and 6b, respectively, predicted from the aMD and IFD models. ᵃPredicted from aMD model 2, Figure S3.
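The validation statistics quoted in this caption (training, LOO, and 4-fold R²) can be illustrated with a minimal sketch. It assumes an ordinary least-squares model and uses synthetic descriptors; the function names, the data, and the choice to pool held-out predictions before scoring are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def cross_validated_r2(X, y, n_folds, seed=0):
    """Refit on each training fold, pool the held-out predictions,
    then score them once. n_folds == len(y) gives leave-one-out."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    held_out = np.empty(len(y))
    for fold in np.array_split(order, n_folds):
        train = np.setdiff1d(order, fold)
        coef = fit_ols(X[train], y[train])
        held_out[fold] = predict(coef, X[fold])
    return r_squared(y, held_out)

# Synthetic stand-in data (hypothetical): 40 reactions, 3 descriptors.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(scale=0.3, size=40)

train_r2 = r_squared(y, predict(fit_ols(X, y), X))
loo_r2 = cross_validated_r2(X, y, n_folds=len(y))
kfold_r2 = cross_validated_r2(X, y, n_folds=4)
print(f"train R2={train_r2:.2f}  LOO R2={loo_r2:.2f}  4-fold R2={kfold_r2:.2f}")
```

For a least-squares fit the cross-validated R² cannot exceed the training R², so a large gap between the two is a common sign of overfitting.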
Figure 3.
Statistical model of HAT side product formation. (A) Regressing experimental ratios of cyclization:HAT with the IFD descriptor set resulted in the best statistical model (training and validation R² values of 0.82 and 0.70, respectively, a LOO R² of 0.76, and a 4-fold R² of 0.70). The model had three ligand descriptors (red) and one enzyme descriptor (blue). (B) HAT model predictions on substrate 6a.
Figure 4.
Mechanistic interpretation of descriptors from initial models. (A) The aMD conformers demonstrated that when aromatic residues 100 and 177 were closely associated, interactions between residues 66 and 100 were precluded, inducing residue 66 flexibility and higher selectivity. The IFD conformational ensembles (right) corroborated that the flexibility of residue 66 is necessary for substrate binding. (B) The term Residue 172 Sterimol L,max from the aMD model indicated that extended configurations of H172 facilitated selectivity. Examination of enzyme conformers where this term was large (GluER-T36A-F269L = 6.7 Å) showed H172 to be extended (blue) and revealed an open binding pocket (yellow sphere); this binding pocket was occluded in structures where values of this parameter were small (GluER-T36A-Y177W = 5.2 Å, red).
Figure 5.
Hypothesis-driven parameter development. (A) Cluster DSAs and (B) IRD were measured to explicitly describe the interactions between residues 66/100/177 and 172/175/177. (C) The overall residue flexibility was measured by computing the RMSD of residue backbone and side-chain atoms.
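The RMSD-based flexibility measure in panel (C) can be sketched as follows. This assumes the conformers are already superimposed and uses the ensemble-average coordinates as the reference; the caption does not specify either choice, and the function names are hypothetical.

```python
import numpy as np

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two (n_atoms, 3) coordinate sets."""
    diff = np.asarray(coords_a, dtype=float) - np.asarray(coords_b, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

def residue_flexibility(ensemble):
    """Mean RMSD of each conformer from the ensemble-average structure.

    ensemble: array of shape (n_conformers, n_atoms, 3) holding the
    backbone and side-chain atoms of one residue, pre-aligned (assumption).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    mean_structure = ensemble.mean(axis=0)
    return float(np.mean([rmsd(conf, mean_structure) for conf in ensemble]))

# Toy two-atom "residue" in two conformers (coordinates in Å, hypothetical).
toy_ensemble = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 2.0, 0.0], [1.0, 0.0, 0.0]],
])
flex = residue_flexibility(toy_ensemble)
print(f"flexibility = {flex:.3f} Å")
```

A rigid residue, whose conformers coincide, scores zero under this definition; larger values indicate that the atoms sample a wider region across the ensemble.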
Figure 6.
Updated model enabled virtual screening and prediction of the selectivity of new GluER-T36A mutants. (A) Similar to the initial models, the updated statistical model has two ligand descriptors (red), several residue-based enzyme descriptors (blue), and one IRD parameter (gray) that measures the distance between residues 100 and 177. (B) Predicted enantioselectivities of 2a and 5a with untested GluER-T36A variants; the range for the predictions was computed at a 99% confidence interval using bootstrap subsampling. ᵃTraining and validation set statistics were computed with a 70:30 split, as further described in the Supporting Information.
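One way to obtain a prediction range like the one in panel (B) is a percentile bootstrap, sketched below. The resampling scheme (rows drawn with replacement, least-squares refit, percentile interval), the function name, and the synthetic data are all assumptions for illustration; the exact procedure is described in the paper's Supporting Information.

```python
import numpy as np

def bootstrap_prediction_interval(X, y, x_new, n_boot=2000, level=0.99, seed=0):
    """Percentile-bootstrap range for a linear-model prediction at x_new."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones(len(X)), X])   # design matrix with intercept
    a_new = np.concatenate([[1.0], x_new])
    preds = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))   # resample rows with replacement
        coef, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
        preds[b] = a_new @ coef
    tail = (1.0 - level) / 2.0 * 100.0
    low, high = np.percentile(preds, [tail, 100.0 - tail])
    return float(low), float(high)

# Synthetic stand-in data (hypothetical): 50 reactions, 2 descriptors.
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 2))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.2, size=50)

low, high = bootstrap_prediction_interval(X, y, x_new=np.array([0.5, -0.5]))
print(f"99% bootstrap range: [{low:.2f}, {high:.2f}]")
```

The width of the range reflects how much the fitted coefficients move when the training set is perturbed, so it tends to widen for variants that lie far from the training data in descriptor space.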
Figure 7.
GluER-T36A-Y343C is structurally unique. The aMD conformational ensemble of GluER-T36A-Y343C (red) displays disorder in the 269-loop region compared to GluER-T36A (light blue). Other variants such as GluER-T36A-Y343M (dark blue) and GluER-T36A-F269C (gray) maintain structures similar to GluER-T36A.
Scheme 1.
Enantioselective Cyclization Catalyzed by GluER-T36A and Mutants
