Bridging Machine Learning and Thermodynamics for Accurate pKa Prediction
- PMID: 39328749
- PMCID: PMC11423309
- DOI: 10.1021/jacsau.4c00271
Abstract
Integrating scientific principles into machine learning models to enhance their predictive performance and generalizability is a central challenge in the development of AI for Science. Herein, we introduce Uni-pKa, a novel framework that successfully incorporates thermodynamic principles into machine learning modeling, achieving high-precision predictions of acid dissociation constants (pKa), a crucial task in the rational design of drugs and catalysts, as well as a modeling challenge in computational physical chemistry for small organic molecules. Uni-pKa utilizes a comprehensive free energy model to represent molecular protonation equilibria accurately. It features a structure enumerator that reconstructs molecular configurations from pKa data, coupled with a neural network that functions as a free energy predictor, ensuring high-throughput, data-driven prediction while preserving thermodynamic consistency. Employing a pretraining-finetuning strategy with both predicted and experimental pKa data, Uni-pKa not only achieves state-of-the-art accuracy in chemoinformatics but also shows comparable precision to quantum mechanics-based methods.
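To make the link between a free energy model and pKa concrete, the following is a minimal sketch of the standard textbook relation, not the authors' implementation. It assumes that each protonation state is an ensemble of microstates (for example, tautomers) with known free energies in kJ/mol, that the proton's free energy is already folded into the deprotonated values, and that the temperature is 298.15 K. The function names `ensemble_free_energy` and `macro_pka` are hypothetical; in a framework like the one described, the per-microstate free energies would come from the neural network rather than being supplied by hand.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)
T = 298.15          # assumed temperature in K


def ensemble_free_energy(free_energies, temperature=T):
    """Boltzmann-weighted free energy of a set of microstates (kJ/mol).

    G = -RT * ln(sum_i exp(-G_i / RT)), evaluated relative to the
    lowest microstate for numerical stability.
    """
    rt = R * temperature
    g_min = min(free_energies)
    partition = sum(math.exp(-(g - g_min) / rt) for g in free_energies)
    return g_min - rt * math.log(partition)


def macro_pka(protonated, deprotonated, temperature=T):
    """Macroscopic pKa from microstate free energies of both ensembles.

    Assumption: the proton's free energy is already included in the
    deprotonated values, so pKa = delta_G / (RT * ln 10).
    """
    rt = R * temperature
    delta_g = (ensemble_free_energy(deprotonated, temperature)
               - ensemble_free_energy(protonated, temperature))
    return delta_g / (rt * math.log(10))
```

Because every pKa is derived from the same set of microstate free energies, the predictions stay mutually consistent: adding a second, equally stable deprotonated microstate, for instance, lowers the macroscopic pKa by exactly log10(2).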
© 2024 The Authors. Published by American Chemical Society.
Conflict of interest statement
The authors declare no competing financial interest.