Chem Sci. 2022 Oct 25;13(46):13782-13794.
doi: 10.1039/d2sc04251g. eCollection 2022 Nov 30.

OSCAR: an extensive repository of chemically and functionally diverse organocatalysts

Simone Gallarati et al. Chem Sci. 2022.

Abstract

The automated construction of datasets has become increasingly relevant in computational chemistry. While transition-metal catalysis has greatly benefitted from bottom-up or top-down strategies for the curation of libraries of organometallic complexes, the field of organocatalysis is mostly dominated by case-by-case studies and lacks transferable data-driven tools that facilitate both the exploration of a wider range of catalyst space and the optimization of reaction properties. For these reasons, we introduce OSCAR, a repository of 4000 experimentally derived organocatalysts along with their corresponding building blocks and combinatorially enriched structures. We outline the fragment-based approach used for database generation and showcase the chemical diversity, in terms of functions and molecular properties, covered in OSCAR. The structures and corresponding stereoelectronic properties are publicly available (https://archive.materialscloud.org/record/2022.106) and constitute the starting point to build generative and predictive models for organocatalyst performance.

Conflict of interest statement

There are no conflicts to declare.

Figures

Fig. 1. (A) Prototypical privileged chiral frameworks for asymmetric catalysis. (B) Classification of organocatalysts according to their catalytic motifs (X = O, S).
Fig. 2. Graphical summary of the steps followed for the curation of OSCAR.
Fig. 3. (A) Distribution histograms of catalytic motifs in the seed database and in the CSD-extracted structures. (B) Pie chart showing percentages of catalytic motifs in the seed and CSD-extracted datasets. (C) Distribution histograms of heteroatom types (X = halogens), and (D) molecular weight in the seed and in the CSD-extracted sets.
Fig. 4. (A) 2D t-SNE map of OSCAR on the basis of the FCHL19 representation. Each point represents an organocatalyst, coloured by the corresponding catalytic motif. Each cluster contains catalysts with similar structures; some examples are shown. R = alkyl group; Ar = aromatic group; PTC = phase-transfer catalyst. (B) Property map: computed (ωB97X-D/Def2-TZVP//B97-D/Def2-TZVP) nucleophilicity (Nrel) vs. electrophilicity (E-index) parameters. A zoom-in of the map is provided on the right-hand side.
Fig. 5. Graphical summary of the steps followed to generate the combinatorial databases OSCAR!(NHC) (top) and OSCAR!(DHBD) (bottom). X = O/S.
Fig. 6. (A) Percentage buried volume vs. N-index of combinatorial NHC organocatalysts. N-index is found to scale linearly with known experimental pKa values of azolium ions (Fig. S8†). (B) HNNH dihedral angle (θ) vs. LUMO energy (ωB97X-D/Def2-TZVP//B97-D/Def2-TZVP) of dual-hydrogen-bond donor species. Good linear correlation between εLUMO and the pKa's of DHBDs has been found (Fig. S13†).
Fig. 7. Distribution plots (y-axis: normalized probability density) of molecular descriptors for NHCs (A and B) and DHBDs (C and D) in the seed + CSD-extracted (red curves) and combinatorial databases (blue). X = O/S.