Adv Sci (Weinh). 2024 Sep;11(35):e2405596.
doi: 10.1002/advs.202405596. Epub 2024 Jul 17.

AI-Powered Mining of Highly Customized and Superior ESIPT-Based Fluorescent Probes

Wenzhi Huang et al. Adv Sci (Weinh). 2024 Sep.

Abstract

Excited-state intramolecular proton transfer (ESIPT) has attracted great attention in fluorescent sensors and luminescent materials due to its unique photobiological and photochemical features. However, existing structures fall far short of the specific demands placed on ESIPT molecules in different scenarios, and the trial-and-error development method is labor-intensive and costly. It is therefore imperative to devise new approaches for exploring promising ESIPT fluorophores. This research proposes an artificial intelligence approach for exploring ESIPT molecules efficiently. The first high-quality ESIPT dataset and a multi-level prediction system are constructed; the system accurately identifies ESIPT molecules among a large number of compounds by distinguishing stepwise from conventional molecules to fluorescent molecules and then to ESIPT molecules. Furthermore, key structural features that contribute to ESIPT are revealed using the SHapley Additive exPlanations (SHAP) method. Three strategies are then proposed to ensure the ESIPT process while maintaining good safety, favorable pharmacokinetic properties, and structural novelty. With these strategies, >700 previously unreported ESIPT molecules are screened from a large pool of 570 000 compounds. The ESIPT process and biosafety of the optimal molecules are successfully validated by quantitative calculation and experiment. This approach is expected to provide a new paradigm for exploring ideal ESIPT molecules.

Keywords: ESIPT; artificial intelligence; fluorescent probe; machine learning; virtual screening.
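
The stepwise screening described in the abstract (conventional molecules, then fluorescent molecules, then ESIPT molecules) can be pictured as a cascade of filters, each applied only to the survivors of the previous one. The following is a minimal sketch of that idea, not the authors' implementation: the `Stage` class, the `cascade_screen` function, the 0.5 thresholds, and both toy scoring functions are hypothetical stand-ins for trained classifiers, and the string checks on SMILES carry no chemical validity.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    """One level of the cascade (hypothetical structure, not from the paper)."""
    name: str
    score: Callable[[str], float]  # stand-in for a trained model's probability
    threshold: float = 0.5         # assumed cut-off


def cascade_screen(
    smiles_list: Sequence[str], stages: Sequence[Stage]
) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Apply each stage to the survivors of the previous one.

    Returns the final survivors and the number of molecules left after
    every stage, so the attrition through the cascade is visible.
    """
    survivors = list(smiles_list)
    counts = [("input", len(survivors))]
    for stage in stages:
        survivors = [s for s in survivors if stage.score(s) >= stage.threshold]
        counts.append((stage.name, len(survivors)))
    return survivors, counts


# Toy scorers: placeholders only. A real system would call fitted classifiers.
def toy_fluorescent_score(smiles: str) -> float:
    # Lowercase 'c' marks an aromatic carbon in SMILES notation.
    return 0.9 if "c" in smiles else 0.1


def toy_esipt_score(smiles: str) -> float:
    # Crude proxy: an oxygen (possible donor) plus an aromatic nitrogen.
    return 0.9 if ("O" in smiles and "n" in smiles) else 0.1


stages = [
    Stage("fluorescent_filter", toy_fluorescent_score),
    Stage("esipt_filter", toy_esipt_score),
]

library = [
    "CCO",                      # ethanol
    "c1ccccc1",                 # benzene
    "Oc1ccccc1-c1nc2ccccc2s1",  # HBT, a classic ESIPT molecule
    "CCN",                      # ethylamine
]

survivors, counts = cascade_screen(library, stages)
print(survivors)
print(counts)
```

Ordering the stages from the broadest filter to the most specific one keeps the later, more specialized models from being evaluated on compounds that were already ruled out.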

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Schematic of hunting for promising ESIPT molecules with artificial intelligence, followed by quantitative calculation and wet experimental validation of the promising ESIPT molecules.
Figure 2
Results of models using different molecular descriptors and algorithms. The heatmap illustrates the ACC of the test set for A) E‐CM and B) E‐FL models constructed with various molecular descriptors and algorithms. C) The plot of predicted versus experimental value of the E‐Barrier model by different algorithms. D) The plot of predicted versus experimental value of the best E‐Barrier model by RF algorithm. ROC curve plots for the best E) E‐CM and F) E‐FL models in CV and the test set. G) Fold rate plot for the best E‐Barrier model.
Figure 3
Model interpretation and structural analysis for ESIPT process. A) SHAP values for the top 20 features of the E‐CM model using MACCS‐RF. B) SHAP values for the top 20 features of the E‐FL model using MACCS‐RF. C) SHAP values for the top 20 features of the E‐CM model using 2D‐RF. D) SHAP values for the top 20 features of the E‐FL model using 2D‐RF. E) Venn diagram showing the overlapping top 20 important features in MACCS‐RF for E‐CM and E‐FL. F) Venn diagram showing the overlapping top 20 important features in 2D‐RF for E‐CM and E‐FL. G) Visualization of some of the important overlapping MACCS fingerprints in E‐CM and E‐FL using MACCS‐RF. H) SHAP force plot analysis of HBT using the MACCS‐RF combination in the E‐FL model. I) SHAP force plot analysis of 5,6‐dihydrobenzo[c]acridin‐1‐ol using the MACCS‐RF combination in the E‐FL model.
Figure 4
The screening of novel ESIPT molecules. A) Workflow of the multi‐level prediction system. B) Top‐ranked molecules in the screening results under different scoring strategies.
Figure 5
Property analysis results of candidate molecules. A) Essential physicochemical properties of the top 100 molecules under the three screening strategies. nHD represents the number of hydrogen‐bond donors, nHA represents the number of hydrogen‐bond acceptors, TPSA (topological polar surface area) describes molecular polarity, logP indicates the partition coefficient between oil and water, and MW represents molecular weight. B) Four classic ESIPT fluorescent molecules: HBI, HBT, Quinoline, and N‐salicylidene aniline. C) Comparison of the safety score of molecules ranked in the top 100 in the toxicity and safety strategy with four classical fluorescent molecules. D) Comparison of the diversity score of the collected ESIPT positive set and three evaluation strategies. T: Toxicity and safety, S: Structural innovation, P: Pharmacokinetics. E) Comparison of the pharmacokinetic score of molecules ranked in the top 100 in the pharmacokinetic properties strategy with four classical fluorescent molecules. F) Ten ESIPT molecules previously reported by our group. G) Candidate ESIPT molecules FL‐1, FL‐2, and FL‐3. H–J) Distributions of synthesizability, LogD, and LogS in the collected set of 922 ESIPT molecules for FL‐1, FL‐2, and FL‐3, respectively.
Figure 6
Validation results of candidate ESIPT molecules. A) S1 state energy distribution of FL‐1 along the O‐H distance in DMSO solution, optimized using PBE0/Def2‐svp. B) S1 state energy distribution of FL‐2 along the O‐H distance in DMSO solution, optimized using PBE0/Def2‐svp. C) S1 state energy distribution of FL‐3 along the O‐H distance in DMSO solution, optimized using PBE0/Def2‐svp. D) The LUMO and HOMO of molecule FL‐1 in the S1 state and the corresponding transition energies at the PBE0/Def2‐svp/IEFPCM levels. E) The LUMO and HOMO of molecule FL‐2 in the S1 state and the corresponding transition energies at the PBE0/Def2‐svp/IEFPCM levels. F) The LUMO and HOMO of molecule FL‐3 in the S1 state and the corresponding transition energies at the PBE0/Def2‐svp/IEFPCM levels. G) Schematic representation of the ESIPT process for FL‐1, FL‐2, and FL‐3. H) Absorption spectra of FL‐3 in different solvents. I) Emission spectra of FL‐3 in different solvents. J) Hemolysis percentage of red blood cells (RBCs) treated with different concentrations of FL‐3. K) Cell viability of A549 cells incubated with FL‐3.
