Chem Sci. 2024 Jan 31;15(10):3640-3660.
doi: 10.1039/d3sc06208b. eCollection 2024 Mar 6.

A genetic optimization strategy with generality in asymmetric organocatalysis as a primary target

Simone Gallarati et al. Chem Sci.

Abstract

A catalyst possessing a broad substrate scope, in terms of both turnover and enantioselectivity, is sometimes called "general". Despite their great utility in asymmetric synthesis, truly general catalysts are difficult or expensive to discover via traditional high-throughput screening and are, therefore, rare. Existing computational tools accelerate the evaluation of reaction conditions from a pre-defined set of experiments to identify the most general ones, but cannot generate entirely new catalysts with enhanced substrate breadth. For these reasons, we report an inverse design strategy based on the open-source genetic algorithm NaviCatGA and on the OSCAR database of organocatalysts to simultaneously probe the catalyst and substrate scope and optimize generality as a primary target. We apply this strategy to the Pictet-Spengler condensation, for which we curate a database of 820 reactions, used to train statistical models of selectivity and activity. Starting from OSCAR, we define a combinatorial space of millions of catalyst possibilities, and perform evolutionary experiments on a diverse substrate scope that is representative of the whole chemical space of tetrahydro-β-carboline products. While privileged catalysts emerge, we show how genetic optimization can address the broader question of generality in asymmetric synthesis, extracting structure-performance relationships from the challenging areas of chemical space.
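The idea of optimizing generality directly can be illustrated with a toy genetic algorithm whose fitness is the median score over a whole substrate set rather than the score on one model substrate. This is only a sketch under loud assumptions: it does not use NaviCatGA or OSCAR, the catalyst encoding (three integer "substituent" slots), the substrate descriptors, and `toy_score` are all hypothetical stand-ins for the trained selectivity and activity models described above.

```python
import random
from statistics import median

# Hypothetical encoding: a catalyst is a tuple of three substituent indices.
N_SLOTS, N_CHOICES = 3, 8
# Hypothetical substrate descriptors standing in for a diverse substrate scope.
SUBSTRATES = [0.5, 1.5, 2.5, 3.5, 4.5]

def toy_score(catalyst, substrate):
    """Placeholder for a trained performance model (higher is better)."""
    return -sum(abs(gene - substrate) for gene in catalyst)

def generality(catalyst):
    """Fitness = median score over the whole substrate set."""
    return median(toy_score(catalyst, s) for s in SUBSTRATES)

def evolve(generations=30, pop_size=20, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    pop = [tuple(rng.randrange(N_CHOICES) for _ in range(N_SLOTS))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the top half survives unchanged (elitism).
        pop.sort(key=generality, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_SLOTS)       # one-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(N_SLOTS):              # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(N_CHOICES)
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=generality)

best = evolve()
print(best, generality(best))
```

Replacing `median` with the score on a single substrate turns the same loop into a specificity-oriented optimization, which is the contrast drawn in the abstract.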

Conflict of interest statement

There are no conflicts to declare.

Figures

Fig. 1. (A) Reaction optimization tactics for the development of catalytic methods: traditional specificity-oriented vs. data-driven multi-substrate screening. (B) Schematic inverse design pipeline powered by NaviCatGA.
Fig. 2. (A) Pictet–Spengler cyclization of tryptamine derivatives (SubA, PG = protecting group, H, or OH) and carbonyls (SubB) in the presence of chiral organocatalysts and weak acid co-catalysts. Examples of hydrogen-bond donors, acid/anion receptor catalysts, and chiral phosphoric acids are shown. ArF = 3,5-CF3-C6H3, X = O/S. (B)–(D) 2D t-SNE map of the reaction space on the basis of the concatenated MFPs of the substrates and catalysts color-coded by the experimental selectivity (ΔΔG, B), catalyst class (C), and SubB class (D). (Th)Ur = (thio)ureas, Sq = squaramides, SHBD = single-hydrogen-bond donors, CPA = chiral phosphoric acids, HBA = hydrogen-bond acceptor, RX = benzoyl bromide or acyl chloride (BzBr, AcCl), ROH = carboxylic acid (e.g., BzOH, AcOH).
Fig. 3. (A) Violin plots of experimental ΔΔG values in the literature database of 820 Pictet–Spengler reactions for six different classes of organocatalysts. The median is indicated with horizontal lines. RX = benzoyl bromide or acyl chloride (BzBr, AcCl), ROH = carboxylic acid (e.g., BzOH, AcOH), HBA = hydrogen-bond acceptor. (B) Tabulated median ΔΔG values for different catalyst–substrate combinations from the literature database. (C) Tabulated number of reactions reported for different catalyst–substrate combinations from the literature database.
Fig. 4. (A) General mechanism for the Pictet–Spengler reaction via anion-binding catalysis. (Thio)urea catalysts (X = O/S) with carboxylic acid co-catalysts are shown as an example. (B) The reactions used to construct molecular volcano plots (SRS) are plotted on the t-SNE map from Fig. 2, colored according to the nature of the organocatalyst. (C) Molecular volcano plots based on the C2 and C3 addition mechanism. The shaded areas denote the 95% confidence interval based on the Linear Free Energy Scaling Relationships. Computations were performed at the PCM(toluene)/M06-2X-D3/Def2-TZVP//M06-2X-D3/Def2-SVP level of theory. (D) Distribution of descriptor values and their location on the volcano plot.
Fig. 5. XGBoost models predicting the (A) descriptor variable [ΔGRRS(2)] of the TOF molecular volcano plots, computed at the PCM(toluene)/M06-2X-D3/Def2-TZVP//M06-2X-D3/Def2-SVP level, and (B) the experimentally measured enantioselectivity (expressed as ΔΔG) of the Pictet–Spengler reactions from the literature. Predictions are obtained by averaging those from a cross-validation scheme with 100 different random 90/10 train/test splits (633/70 for A, 738/82 for B). The error bars are obtained from the standard deviations from the 100 different train/test splits.
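The repeated random-split scheme in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: a one-variable least-squares line (`fit_line`) stands in for the XGBoost models, and averaging each sample's predictions over the splits in which it was held out is an assumption about how the averages and error bars were formed. The helper `split_sizes` reproduces the 633/70 and 738/82 splits quoted above.

```python
import random
from statistics import mean, pstdev

def split_sizes(n, test_frac=0.1):
    """Return (n_train, n_test) for a random split of n samples."""
    n_test = round(n * test_frac)
    return n - n_test, n_test

def fit_line(xs, ys):
    """Stand-in model: ordinary least squares on one feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def repeated_holdout(X, y, fit, n_repeats=100, test_frac=0.1, seed=0):
    """Average held-out predictions per sample over many random splits."""
    rng = random.Random(seed)
    n = len(y)
    _, n_test = split_sizes(n, test_frac)
    held_out = [[] for _ in range(n)]
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in test:
            held_out[i].append(predict(X[i]))
    # (mean, standard deviation) per sample; None if never held out.
    return [(mean(p), pstdev(p)) if p else None for p in held_out]

print(split_sizes(703), split_sizes(820))
```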
Fig. 6. (A) 2D t-SNE map of the substrate scope on the basis of the concatenated MFPs of SubA and SubB. Blue squares indicate organocatalytic reactions, green squares reactions reported in Reaxys®, red triangles the Generality Probing Set (GPS) from this work. (B) Examples of reactions found in the GPS.
Fig. 7. Box-and-whisker charts showing the evolution of ΔΔG and ΔGRRS(2) of the top individual in the CPA population for selected generations (i.e., when the identity of the best-performing catalyst changes). Each datapoint corresponds to a reaction in the GPS; the yellow diamond indicates reaction 11 (shown in the top left). Outliers and far outliers are indicated with filled circles and squares, respectively. In (A), ΔΔG and ΔGRRS(2) of reaction 11 are optimized, whereas in (B) the median ΔΔG and ΔGRRS(2) of all reactions in the GPS are optimized.
Fig. 8. (Left) Evolution of ΔΔG and ΔGRRS(2) of the top individual in the HBD population over 50 generations. The solid lines indicate the median across the GPS, and the shaded areas represent the upper and lower values. Selected catalysts are shown, with different colored spheres representing different R1–3 substituents. (Right) Box-and-whisker chart of ΔΔG and ΔGRRS(2) for selected generations (i.e., only when the structure of the best-performing catalyst changes). Each datapoint corresponds to a reaction in the GPS. Outliers and far outliers are indicated with filled circles and squares, respectively.
Fig. 9. Median selectivity vs. activity [ΔGRRS(2)med] scatter plot for multi-objective optimization on the HBD scope, color-coded by catalyst generation. The volcano peak (maximum activity) corresponds to ΔGRRS(2) = −9.0 kcal mol⁻¹. The dashed lines show the connections for the set of “noninferior” solutions in the objective space (Pareto optimal solutions). The gray diamond represents the top candidate from the single-objective optimization experiment (SOO, generation 37).
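A set of noninferior (Pareto optimal) candidates like the one connected by the dashed lines can be extracted with a simple dominance check. The sketch below is hedged: it assumes the two objectives are a median ΔΔG to be maximized and the distance of the median ΔGRRS(2) from the volcano peak at −9.0 kcal/mol to be minimized, and the candidate values are invented purely for illustration.

```python
PEAK = -9.0  # volcano peak from the caption, in kcal/mol

def objectives(ddg_med, dg_rrs_med):
    """Map a candidate to two scores that are both maximized.

    Assumption: activity is scored by closeness to the volcano peak.
    """
    return (ddg_med, -abs(dg_rrs_med - PEAK))

def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the candidates that no other candidate dominates."""
    scored = [(p, objectives(*p)) for p in points]
    return [p for p, s in scored
            if not any(dominates(t, s) for _, t in scored)]

# Invented (median ΔΔG, median ΔGRRS(2)) pairs, for illustration only.
candidates = [(1.0, -9.0), (1.5, -8.0), (2.0, -6.0), (1.2, -7.0), (0.8, -10.0)]
print(pareto_front(candidates))
```

Here the first three candidates trade selectivity against activity and survive, while the last two are each beaten on both objectives by another candidate.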
Fig. 10. Calculated ee and log TOF values from the predicted ΔΔG and ΔGRRS(2), respectively. Results are shown for selected catalyst generations (x-axis) and reactions in the GPS (y-axis), while ee and log TOF median values (bottom) consider all 50 reactions in the GPS. SOO-37 is the top catalyst from the single-objective optimization experiment (structure shown in Fig. 9). Selected SubA and SubB combinations are shown.
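For readers who want to move between ΔΔG and ee, the standard Boltzmann relation ee = tanh(ΔΔG / 2RT) can be sketched as below. The temperature (298.15 K) and the function names are assumptions made for illustration; the temperatures used for the values in this figure are not stated in this excerpt, and the log TOF conversion is not attempted because it depends on the volcano plot itself.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ee_from_ddg(ddg_kcal, temperature=298.15):
    """Enantiomeric excess (as a fraction) from ΔΔG in kcal/mol.

    Uses er = exp(ΔΔG / RT), which gives ee = tanh(ΔΔG / (2RT)).
    The default temperature is an assumption.
    """
    return math.tanh(ddg_kcal / (2.0 * R_KCAL * temperature))

def ddg_from_ee(ee, temperature=298.15):
    """Inverse relation: ΔΔG in kcal/mol from a fractional ee (|ee| < 1)."""
    return 2.0 * R_KCAL * temperature * math.atanh(ee)

for ddg in (0.5, 1.0, 2.0):
    print(f"ΔΔG = {ddg:.1f} kcal/mol -> ee = {100 * ee_from_ddg(ddg):.1f}%")
```

Because tanh saturates, each additional kcal/mol of ΔΔG buys progressively less ee once selectivity is already high.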
Fig. 11. Energetically lowest-lying TS for the deprotonation/rearomatization step (TS3) of the tetrahydro-β-carboline intermediate of GPS reactions 13 (left) and 47 (right) with the top-performing organocatalyst from generation 32. The distance between the catalyst's amide O and the indole N–H is shown. Computed and predicted enantioselectivity (expressed in terms of ΔΔG) and activity [expressed in terms of ΔGRRS(2)] values are reported.
