J Chem Inf Model. 2025 Aug 25;65(16):8426-8434.
doi: 10.1021/acs.jcim.4c02399. Epub 2025 Aug 10.

Advancing Aqueous Solubility Prediction: A Machine Learning Approach for Organic Compounds Using a Curated Data Set

Mushtaq Ali et al. J Chem Inf Model.

Abstract

Aqueous solubility is a key property of a chemical compound, determining its possible use in applications ranging from drug development to materials science. In this work, we present a model for predicting aqueous solubility that draws on a curated data set merged from four distinct sources. This data set encompasses a diverse range of organic compounds, providing a robust foundation for our investigation of solubility prediction. Our approach employs a variety of machine learning and deep learning models that combine an extensive array of chemical descriptors, fingerprints, and functional groups. This methodology is designed to address the complexities of solubility prediction and is tailored to achieve high accuracy and generalization. We tested the finalized model on a diverse data set of 1282 unique organic compounds from the Huuskonen data set. With an R² value of 0.92 and an MAE value of 0.40, the model outperforms existing prediction methods for aqueous solubility on one of the most diverse data sets in the field.
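
For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how the coefficient of determination (R²) and the mean absolute error (MAE) are conventionally computed from measured and predicted values. The function names and the toy numbers are illustrative assumptions only; they are not the authors' code or data.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values (not taken from the paper or the Huuskonen data set).
measured = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.5, -2.0, -2.5, -4.0]

mae = mean_absolute_error(measured, predicted)
r2 = r_squared(measured, predicted)
print(f"MAE = {mae:.2f}, R2 = {r2:.2f}")
```

A lower MAE and an R² closer to 1 both indicate closer agreement between predictions and measurements; the two are usually reported together because R² depends on the spread of the test set while MAE does not.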
