Int J Mol Sci. 2020 Feb 23;21(4):1523.
doi: 10.3390/ijms21041523.

Discovery of Small-Molecule Activators for Glucose-6-Phosphate Dehydrogenase (G6PD) Using Machine Learning Approaches

Madhu Sudhana Saddala et al. Int J Mol Sci. 2020.

Abstract

Glucose-6-phosphate dehydrogenase (G6PD) is a ubiquitous cytoplasmic enzyme that converts glucose-6-phosphate into 6-phosphogluconate in the pentose phosphate pathway (PPP). G6PD deficiency impairs the regeneration of glutathione because of a lack of nicotinamide adenine dinucleotide phosphate (NADPH) and produces stress conditions that can cause oxidative injury to photoreceptors, retinal cells, and blood barrier function. In this study, we constructed pharmacophore models based on the complex of G6PD with the G6PD activator AG1, followed by virtual screening. Fifty-three hit molecules mapped to the core pharmacophore features. We performed molecular descriptor calculation, clustering, and principal component analysis (PCA) on the pharmacophore hit molecules and then applied statistical machine learning methods. With optimal performance, the pharmacophore modeling and machine learning approaches classified the 53 hits as drug-like (18) and nondrug-like (35) compounds. The drug-like compounds were further evaluated with our established cheminformatics pipeline (molecular docking and in silico ADMET (absorption, distribution, metabolism, excretion, and toxicity) analysis). Finally, five lead molecules with different scaffolds were selected on the basis of binding energies and in silico ADMET properties. This study proposes that combining machine learning methods with traditional structure-based virtual screening can effectively strengthen the ability to find potential G6PD activators for diseases of G6PD deficiency. Moreover, these compounds can be considered safe agents for further validation studies at the cellular level, in animal models, and even in clinical settings.

Keywords: ADMET; G6PD; docking; machine learning; pharmacophore modeling.

Conflict of interest statement

The authors have no competing interests.

Figures

Figure 1
Depiction of the target enzyme G6PD and its active sites. (A) The G6PD dimer represented as a cartoon model; (B) the G6PD dimer represented as a surface model. Monomer 1 (cyan) and monomer 2 (yellow) associate to form the dimer. The dimer interface active site is shown in pink, and dark red spheres designate the NADP+ binding sites. The dimer interface is designated with red.
Figure 2
Pharmacophore model of the G6PD-AG1 complex. (A) The pharmacophore model contains four pharmacophore features: two hydrogen donors (green), one positive ionizable group, and one aromatic ring. (B) The AG1 compound interacts with the functional residues His513, Asp421, and Arg427 of G6PD. The pharmacophore features were applied to the PubChem database, and the 53 hit molecules fit these features.
Figure 3
The molecular descriptors and clustering of the 53 hit molecules. (A) The molecular descriptors represented as a heatmap, with positive values in red and negative values in blue. (B) Hierarchical clustering of the molecular descriptors, which produced seven cluster trees. Cluster 1 has four compounds, Cluster 2 and Cluster 3 have ten compounds each, Cluster 4 has fourteen compounds, Cluster 5 has four compounds, Cluster 6 has five compounds, and Cluster 7 has seven compounds.
Figure 4
Principal component analysis (PCA) of the 53 pharmacophore hit molecules. (A) The PCA showed various groups of compounds based on the Tanimoto coefficient (distance), plotting the first component (PCA1) against the second component (PCA2). (B) The logarithm of the calculated partition coefficient (logP) against the polar surface area (PSA) showed that the compounds have a maximum logP of 5.8 and a maximum PSA of 66. (C) The molecular weight (MW) against the PSA showed that the compounds have a maximum PSA of 66 and a maximum MW of 400. (D) The MW against logP showed that the compounds have a maximum logP of 5.8 and a maximum MW of 400.
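The Tanimoto coefficient named in panel (A) can be illustrated with a minimal sketch. The fingerprints below are hypothetical sets of "on" bit positions, not data from the study, and the function names are chosen only for this example; the caption does not say which fingerprint type or software the authors used.

```python
def tanimoto_similarity(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def tanimoto_distance(a: set, b: set) -> float:
    """Distance form of the coefficient: 0 for identical sets, 1 for disjoint sets."""
    return 1.0 - tanimoto_similarity(a, b)


# Hypothetical fingerprints for two molecules (illustrative bit positions only).
fp_x = {1, 4, 7, 9}
fp_y = {1, 4, 8, 9, 12}

sim = tanimoto_similarity(fp_x, fp_y)   # 3 shared bits / 6 bits in the union
dist = tanimoto_distance(fp_x, fp_y)
print(f"similarity={sim:.2f} distance={dist:.2f}")
```

A pairwise matrix of such distances is one common input for grouping compounds before projecting them onto two components.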
Figure 5
Statistical machine learning predictions classified the 53 pharmacophore hit candidate molecules as drug-like (18) and nondrug-like (35) compounds.
Figure 6
Molecular docking analysis showing all 18 drug-like molecules docked into the G6PD dimer interface active site (top). The top five compounds by binding energy were aligned and superimposed (bottom).
Figure 7
Protein–ligand interaction analysis of the best five compounds and AG1. Each compound interacted with functional residues of the G6PD active site (dimer interface domain) at the binding energy shown: (A) CID6917760 (−8.9 kcal/mol); (B) CID9820229 (−7.6 kcal/mol); (C) CID5221957 (−7.3 kcal/mol); (D) CID389556 (−7.2 kcal/mol); (E) CID10900930 (−7.0 kcal/mol); (F) AG1 (CID6615809) (−6.1 kcal/mol). The binding site functional residues are represented as a sticks model in rainbow colors, the top five compounds as a sticks model in magenta, and the G6PD protein as a cartoon model in white.
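As a small illustration of how the docking results in Figure 7 compare, the sketch below ranks the six compounds by the binding energies given in the caption and reports each one's difference from the reference activator AG1. The variable names are chosen for this example only; more negative energies indicate stronger predicted binding.

```python
# Binding energies (kcal/mol) as reported in the Figure 7 caption.
binding_energy = {
    "CID6917760": -8.9,
    "CID9820229": -7.6,
    "CID5221957": -7.3,
    "CID389556": -7.2,
    "CID10900930": -7.0,
    "AG1 (CID6615809)": -6.1,
}
reference = "AG1 (CID6615809)"

# Sort ascending so the most negative (strongest predicted) binder comes first.
ranked = sorted(binding_energy.items(), key=lambda item: item[1])

for name, energy in ranked:
    delta = energy - binding_energy[reference]
    print(f"{name:18s} {energy:6.1f} kcal/mol  (vs AG1: {delta:+.1f})")
```

All five selected compounds have a more negative docking energy than AG1 in this comparison.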
Figure 8
The ADMET properties of the five best G6PD small-molecule activators (CID6917760, CID9820229, CID5221957, CID389556, and CID10900930) and AG1 (CID6615809). The pink area represents the optimal range for each property (lipophilicity: XLOGP3 between −0.7 and +5.0; size: MW between 150 and 500 g/mol; polarity: TPSA between 20 and 130 Å²; solubility: log S not higher than 6; saturation: fraction of carbons in the sp3 hybridization not less than 0.25; and flexibility: no more than 9 rotatable bonds).
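The optimal ranges listed in the Figure 8 caption can be expressed as a simple check. This is a minimal sketch: the `Properties` class, the `out_of_range` helper, and both example molecules are hypothetical and do not come from the study. The solubility bound is applied literally as written in the caption (log S <= 6), since the caption does not state the sign convention.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Hypothetical container for the six properties named in the caption."""
    xlogp3: float          # lipophilicity
    mw: float              # molecular weight, g/mol
    tpsa: float            # topological polar surface area, square angstroms
    log_s: float           # solubility
    fraction_csp3: float   # fraction of sp3-hybridized carbons
    rotatable_bonds: int   # flexibility


def out_of_range(p: Properties) -> list:
    """Return the names of the properties that fall outside the caption's ranges."""
    checks = {
        "lipophilicity": -0.7 <= p.xlogp3 <= 5.0,
        "size": 150 <= p.mw <= 500,
        "polarity": 20 <= p.tpsa <= 130,
        "solubility": p.log_s <= 6,  # caption wording taken literally (assumption)
        "saturation": p.fraction_csp3 >= 0.25,
        "flexibility": p.rotatable_bonds <= 9,
    }
    return [name for name, ok in checks.items() if not ok]


# Two made-up molecules, for illustration only.
inside = Properties(xlogp3=3.2, mw=350.0, tpsa=66.0, log_s=-4.1,
                    fraction_csp3=0.30, rotatable_bonds=5)
outside = Properties(xlogp3=5.8, mw=400.0, tpsa=66.0, log_s=-6.5,
                     fraction_csp3=0.10, rotatable_bonds=11)

print(out_of_range(inside))
print(out_of_range(outside))
```

Returning the list of failed properties, rather than a single pass/fail flag, makes it clear which axis of the radar plot a compound falls outside.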

