Nat Commun. 2024 Mar 23;15(1):2603.
doi: 10.1038/s41467-024-46866-9.

Developing a machine learning model for accurate nucleoside hydrogels prediction based on descriptors


Weiqi Li et al. Nat Commun. 2024.

Abstract

Supramolecular hydrogels derived from nucleosides have been gaining significant attention in the biomedical field due to their unique properties and excellent biocompatibility. However, a major challenge in this field is that there is no model for predicting whether a nucleoside derivative will form a hydrogel. Here, we develop a machine learning model to predict the hydrogel-forming ability of nucleoside derivatives. The optimal model, with an accuracy of 71% (95% confidence interval, 0.69-0.73), is established based on a dataset of 71 reported nucleoside derivatives. Twenty-four molecules are selected through external application of the optimal model, and their hydrogel-forming ability is experimentally verified. Among these, two rarely reported cation-independent nucleoside hydrogels are found. Based on their self-assembly mechanisms, the cation-independent hydrogel is found to have potential applications in the rapid visual detection of Ag+ and cysteine. Here, we show that the machine learning model may provide a tool to predict nucleoside derivatives with hydrogel-forming ability.
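
The abstract reports an accuracy with a 95% confidence interval but does not say how the interval was obtained. As a minimal sketch, assuming the interval comes from repeated evaluations and a normal approximation (mean ± 1.96 × standard error), the calculation could look like the following. The function name `accuracy_ci` and the accuracy values are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, stdev


def accuracy_ci(accuracies, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval.

    Assumption: each entry is the test accuracy of one repeated evaluation.
    """
    m = mean(accuracies)
    # Standard error of the mean from the sample standard deviation.
    sem = stdev(accuracies) / math.sqrt(len(accuracies))
    return m, m - z * sem, m + z * sem


# Hypothetical accuracies from five repeated evaluations (not the paper's data).
runs = [0.70, 0.72, 0.71, 0.69, 0.73]
m, low, high = accuracy_ci(runs)
print(f"accuracy {m:.2f} (95% CI {low:.3f}-{high:.3f})")
```

With only a handful of runs, a t-distribution or bootstrap interval would usually be a more cautious choice than the normal approximation used here.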

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Machine learning prediction of the hydrogel-forming ability of nucleoside derivatives.
An optimal model was constructed to predict the hydrogel-forming ability of nucleoside derivatives. Potential gelators were selected by external application of the optimal model, and their hydrogel-forming ability was verified experimentally. In addition, the self-assembly mechanism of the cation-independent hydrogel was explored; this hydrogel could be applied to the rapid visual detection of Ag+ and cysteine.
Fig. 2. The flowchart of model construction and feature selection of the descriptors.
a Flowchart of model construction. A total of 4175 descriptors were initially obtained; 144 descriptors with significant differences (P < 0.05) were selected by the rank-sum test, and 40 descriptors remained after one descriptor was excluded from each pair with a Spearman correlation coefficient higher than 0.8 (Rho > 0.8). b Results of the rank-sum test. The vertical axis shows the logarithm of the P-value (log P-value), and the horizontal axis shows the logarithm of the fold change (log FC) between the mean values of the gelator and non-gelator groups. c Correlation heatmap of the 40 descriptors. All correlations between descriptors were less than 0.80. d Three-dimensional (3D) principal component analysis (PCA) of the 71 nucleoside derivatives with 4175 descriptors, showing the gelator and non-gelator groups. e 3D PCA of the 71 nucleoside derivatives with 40 descriptors, showing the two groups.
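
The two filtering steps in panel a (rank-sum test, then removal of highly correlated descriptors) can be sketched as below. This is an illustrative sketch, not the authors' code. It assumes a normal approximation without tie correction for the rank-sum P-value, compares the absolute Spearman coefficient with 0.8, and keeps the first-listed descriptor of each correlated pair; the caption does not specify any of these choices. All function names and the toy data are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_p(a, b):
    """Two-sided rank-sum P-value (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    w = sum(average_ranks(list(a) + list(b))[:n1])  # rank sum of group a
    expected = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 2 * (1 - NormalDist().cdf(abs((w - expected) / sd)))


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    vx = sum((p - mx) ** 2 for p in rx)
    vy = sum((q - my) ** 2 for q in ry)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def select_descriptors(table, labels, alpha=0.05, rho_max=0.8):
    """table maps descriptor name -> values; labels are 1 (gelator) or 0."""
    significant = []
    for name, column in table.items():
        gel = [v for v, y in zip(column, labels) if y == 1]
        non = [v for v, y in zip(column, labels) if y == 0]
        if rank_sum_p(gel, non) < alpha:
            significant.append(name)
    selected = []
    for name in significant:
        # Keep a descriptor only if it is not too correlated with any kept one.
        if all(abs(spearman(table[name], table[s])) <= rho_max for s in selected):
            selected.append(name)
    return selected


# Toy data (hypothetical): six gelators followed by six non-gelators.
labels = [1] * 6 + [0] * 6
toy = {
    "a": [10, 11, 12, 13, 14, 15, 1, 2, 3, 4, 5, 6],
    "b": [21, 23, 25, 27, 29, 31, 3, 5, 7, 9, 11, 13],   # monotone copy of "a"
    "c": [1, 8, 3, 10, 5, 12, 2, 7, 4, 9, 6, 11],        # no group difference
    "d": [15, 14, 13, 12, 11, 10, 6, 5, 4, 3, 2, 1],     # separates groups
}
selected = select_descriptors(toy, labels)
print(selected)
```

In this toy table, "c" fails the rank-sum filter, "b" is dropped because it is perfectly rank-correlated with "a", and "a" and "d" remain.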
Fig. 3. Evaluation indexes of different models and feature importance of the optimal model.
a Scatterplot of the AUC (area under the curve) and test accuracy for all models. The four point shapes represent the machine learning (ML) algorithms: extreme gradient boosting (XGBoost), logistic regression (LR), decision tree (DT), and random forest (RF). The descriptor sets are the 4175 descriptors initially obtained, the 144 descriptors after the rank-sum test, the 40 descriptors after correlation coefficient selection, and the descriptors after recursive feature elimination (RFE). The optimal number of descriptors after RFE differs for each ML algorithm (XGBoost, n = 16; LR, n = 24; DT, n = 30; RF, n = 37). Data are mean values ± standard error of the mean (SEM). b Evaluation indexes (test accuracy, F1 score, and AUC) of the four algorithms using descriptors after RFE. Data are mean values ± SEM. c Receiver operating characteristic curves for the four algorithms (LR, DT, RF, and XGBoost) using descriptors after RFE. d RFE results of the LR models for different numbers of descriptors drawn from the 40 descriptors, indicating that LR with 24 descriptors had the best performance. Data are mean values ± SEM. e Feature importance of the 24 descriptors in the optimal LR model, based on the regression coefficients.
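
Recursive feature elimination with a logistic regression model, as referred to in panels a, d, and e, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes descriptors are standardized, the model is fitted by batch gradient descent without regularization, and one descriptor (the one with the smallest absolute coefficient) is dropped per round. The function names, hyperparameters, and toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(X):
    """Scale each column to zero mean and unit variance."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by 0.
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) or 1.0
            for j in range(d)]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent; return (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - target
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rfe(X, y, n_keep):
    """Return the column indices that survive recursive feature elimination."""
    Xs = standardize(X)
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        sub = [[row[j] for j in active] for row in Xs]
        w, _ = fit_logistic(sub, y)
        # Drop the descriptor whose coefficient has the smallest magnitude.
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active


# Toy data (hypothetical): column 0 separates the classes, columns 1-2 do not.
X = [[2.0, 1, 1], [2.5, -1, 1], [3.0, 1, -1], [3.5, -1, -1],
     [-2.0, 1, 1], [-2.5, -1, 1], [-3.0, 1, -1], [-3.5, -1, -1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
print(rfe(X, y, n_keep=1))
```

To choose the number of descriptors, as in panel d, one would typically repeat this for each candidate `n_keep` and compare cross-validated accuracy; that step is omitted here.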
Fig. 4. Prediction and verification of untested nucleoside derivatives.
a Twenty-four nucleoside derivatives (12 high probability and 12 low probability) were selected in a relatively homogeneous manner based on our experience and the costs of obtaining and synthesizing nucleoside derivatives. b The 12 nucleoside derivatives selected with a high probability of hydrogel-forming ability. Ten of them (1, 3, 4, 6, 7, 8, 9, 10, 11, and 12) formed hydrogels, while the other two (2 and 5) did not. 1, 1-[3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-1,3,5-triazinane-2,4,6-trione, DTT; 2, xanthosine, XTS; 3, guanosine 5’-monophosphate, GMP; 4, inosine 5’-monophosphate, IMP; 5, 5-fluorouridine, 5-FUR; 6, 8-aminoguanosine, 8-AG; 7, 2’-deoxyguanosine 5’-monophosphate, dGMP; 8, 8-hydroxyguanosine, 8-OHG; 9, 8-azaguanosine, 8-azaG; 10, inosine-5’-carboxylic acid, I-5’-CA; 11, 2’-amino-2’-deoxyguanosine, 2’-NH2-dG; and 12, 2’-O-methylguanosine, 2’-OMe-dG.
Fig. 5. Characterization of the hydrogels.
a Photographs of 8AG-T and 8OHG-T hydrogels 6 months after preparation. b, c Evolution of G′ and G′′ as a function of frequency for 8AG-T (b) and 8OHG-T (c) hydrogels. d, e Self-healing of 8AG-T (d) and 8OHG-T (e) hydrogels measured by rheology. f Scanning electron microscopy (SEM, scale bar: 50 μm) images of 8AG-T and 8OHG-T hydrogels. g Atomic force microscopy (AFM, scale bar: 200 nm) images of 8AG-T and 8OHG-T hydrogels. h, i Pair distance distribution function (PDDF) profiles from variable-temperature small-angle X-ray scattering (VT-SAXS) experiments on 8AG-T (h) and 8OHG-T (i) hydrogels.
Fig. 6. Self-assembly mechanism of the cation-independent hydrogels.
a 11B nuclear magnetic resonance (NMR) spectra of 8AG-T and 8OHG-T hydrogels. b Fluorescence intensity of Alizarin Red S (ARS) in 8AG-T and 8OHG-T hydrogels. c Thioflavin T (ThT) assay of 8AG-T and 8OHG-T hydrogels. d Circular dichroism spectra of 8AG-T and 8OHG-T hydrogels. e The chemical structure and single crystal structure of 6. f 1H–1H nuclear Overhauser effect (NOE) of 8AG-T hydrogels. g The single crystal structure of the base-pair pattern. h Schematic diagram of the single crystal of 6. The red dashed box includes the interactions between dimethyl sulfoxide (DMSO) and 8AG. i Powder X-ray diffraction (PXRD) patterns of 8AG-T and 8OHG-T hydrogels. j Schematic illustration of the formation of an 8AG-T hydrogel.
Fig. 7. Detection of Ag+ and cysteine based on the 8OHG-T hydrogel.
a, b The fluorescence of the 8AG-T (a) and 8OHG-T (b) hydrogels after adding Rho123. c Photographs of the 8OHG-T hydrogels after adding ionic solutions. d 1H nuclear magnetic resonance (NMR) spectrophotometric titration of the 8OHG-T hydrogel with increasing Ag+. The peaks represent N1H of 8. e 1H NMR spectrophotometric titration of the 8OHG-T hydrogel with increasing Ag+. The peaks represent N7H of 8. f The process of the detection of Ag+ and cysteine based on the 8OHG-T hydrogel.
