Probing machine learning models based on high throughput experimentation data for the discovery of asymmetric hydrogenation catalysts
- PMID: 39211503
- PMCID: PMC11352728
- DOI: 10.1039/d4sc03647f
Probing machine learning models based on high throughput experimentation data for the discovery of asymmetric hydrogenation catalysts
Abstract
Enantioselective hydrogenation of olefins by Rh-based chiral catalysts has been extensively studied for more than 50 years. Naively, one would expect that everything about this transformation is known and that selecting a catalyst that induces the desired reactivity or selectivity is a trivial task. Nonetheless, ligand engineering or selection for any new prochiral olefin remains an empirical trial-error exercise. In this study, we investigated whether machine learning techniques could be used to accelerate the identification of the most efficient chiral ligand. For this purpose, we used high throughput experimentation to build a large dataset consisting of results for Rh-catalyzed asymmetric olefin hydrogenation, specially designed for applications in machine learning. We showcased its alignment with existing literature while addressing observed discrepancies. Additionally, a computational framework for the automated and reproducible quantum-chemistry based featurization of catalyst structures was created. Together with less computationally demanding representations, these descriptors were fed into our machine learning pipeline for both out-of-domain and in-domain prediction tasks of selectivity and reactivity. For out-of-domain purposes, our models provided limited efficacy. It was found that even the most expensive descriptors do not impart significant meaning to the model predictions. The in-domain application, while partly successful for predictions of conversion, emphasizes the need for evaluating the cost-benefit ratio of computationally intensive descriptors and for tailored descriptor design. Challenges persist in predicting enantioselectivity, calling for caution in interpreting results from small datasets. Our insights underscore the importance of dataset diversity with broad substrate inclusion and suggest that mechanistic considerations could improve the accuracy of statistical models.
This journal is © The Royal Society of Chemistry.
Conflict of interest statement
There are no conflicts to declare.
Figures







Similar articles
-
Iridium-Catalyzed Asymmetric Hydrogenation of Unsaturated Carboxylic Acids.Acc Chem Res. 2017 Apr 18;50(4):988-1001. doi: 10.1021/acs.accounts.7b00007. Epub 2017 Apr 4. Acc Chem Res. 2017. PMID: 28374998
-
Importance of Engineered and Learned Molecular Representations in Predicting Organic Reactivity, Selectivity, and Chemical Properties.Acc Chem Res. 2021 Feb 16;54(4):827-836. doi: 10.1021/acs.accounts.0c00745. Epub 2021 Feb 3. Acc Chem Res. 2021. PMID: 33534534
-
Highly Efficient Asymmetric Hydrogenation Catalyzed by Iridium Complexes with Tridentate Chiral Spiro Aminophosphine Ligands.Acc Chem Res. 2023 Feb 7;56(3):332-349. doi: 10.1021/acs.accounts.2c00764. Epub 2023 Jan 23. Acc Chem Res. 2023. PMID: 36689780
-
Asymmetric hydrogenation using monodentate phosphoramidite ligands.Acc Chem Res. 2007 Dec;40(12):1267-77. doi: 10.1021/ar7001107. Epub 2007 Aug 18. Acc Chem Res. 2007. PMID: 17705446 Review.
-
Catalytic asymmetric organozinc additions to carbonyl compounds.Chem Rev. 2001 Mar;101(3):757-824. doi: 10.1021/cr000411y. Chem Rev. 2001. PMID: 11712502 Review.
Cited by
-
Advances in Gasoline Hydrodesulfurization Catalysts: The Role of Structure-Activity Relationships and Machine Learning Approaches.ACS Omega. 2025 Jul 18;10(29):31262-31273. doi: 10.1021/acsomega.5c02980. eCollection 2025 Jul 29. ACS Omega. 2025. PMID: 40757365 Free PMC article. Review.
-
Data-Driven Virtual Screening of Conformational Ensembles of Transition-Metal Complexes.J Chem Theory Comput. 2025 May 27;21(10):5334-5345. doi: 10.1021/acs.jctc.5c00303. Epub 2025 May 9. J Chem Theory Comput. 2025. PMID: 40340435 Free PMC article.
References
-
- Horner L. Siegel H. Büthe H. Angew. Chem., Int. Ed. 1968;7:942. doi: 10.1002/anie.196809422. - DOI
-
- Knowles W. S. Sabacky M. J. Chem. Commun. 1968:1445–1446. doi: 10.1039/C19680001445. - DOI
-
- Marianov A. N. Jiang Y. Baiker A. Huang J. Chem. Catal. 2023;3:100631. doi: 10.1016/j.checat.2023.100631. - DOI
LinkOut - more resources
Full Text Sources