Front Genet. 2023 Jan 4:13:1087273.
doi: 10.3389/fgene.2022.1087273. eCollection 2022.

A machine learning-based approach to ERα bioactivity and drug ADMET prediction

Tianbo An et al. Front Genet.

Abstract

Predicting ERα bioactivity and mining the relationships among absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties can improve the efficiency of developing targeted breast cancer drugs and reduce the misjudgment rate of R&D personnel. A quantitative prediction model of ERα bioactivity and a classification prediction model of ADMET properties were constructed. ERα bioactivity predictions from XGBoost, LightGBM, Random Forest, and an MLP neural network were compared. Based on mean absolute error (MAE), mean squared error (MSE), and R², the two most accurate models were selected and fused to obtain the ERα bioactivity prediction model. The data were further subjected to model-based feature selection and to FDR/FPR-based feature selection, and the resulting classifiers were combined in a voting classifier to obtain the ADMET classification prediction model. In this study, 430 molecular descriptors were removed, and the 20 molecular descriptors with the most significant effect on biological activity, obtained by the dual feature screening combined optimization method, were used to establish a molecular descriptor prediction model for ERα biological activity; the ADMET properties of the drugs were then classified and predicted. For the model-based feature selection, 80 variables were selected by ExtraTreesClassifier and 40 variables by GradientBoostingClassifier. FDR/FPR-based feature selection was applied in parallel, and the three classification models obtained by the two methods were placed into the voting classifier to obtain the final model.
The experimental results showed that the model's evaluation metrics and ROC curves were excellent and that it could accurately predict ERα bioactivity and ADMET properties. The model constructed in this study is accurate, converges quickly, and is robust; it achieves very high accuracy for ADMET and ERα classification prediction, has bright prospects in the biopharmaceutical field, and is an important method for energy conservation and yield increase in the future.
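As a rough illustration of the regression step, the sketch below computes MAE, MSE, and R² in plain Python and fuses two models' predictions by weighted averaging. The abstract does not say how the two selected models were fused, so the averaging rule, the function names, and the pIC50 values are all assumptions made for this example only.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fuse(pred_a, pred_b, weight_a=0.5):
    """Assumed fusion rule: weighted average of two models' predictions."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical pIC50 values, not data from the study.
y_true = [6.0, 7.0, 8.0, 9.0]
pred_a = [6.5, 7.5, 8.5, 9.5]  # model A overshoots by 0.5
pred_b = [5.5, 6.5, 7.5, 8.5]  # model B undershoots by 0.5
fused = fuse(pred_a, pred_b)

for name, pred in [("A", pred_a), ("B", pred_b), ("fused", fused)]:
    print(name, mae(y_true, pred), mse(y_true, pred), r2(y_true, pred))
```

In this toy case the two models err in opposite directions, so averaging cancels the errors; real models rarely complement each other this cleanly, and the weight would normally be tuned on validation data.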
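The FDR/FPR-based selection and the voting step can be sketched in a few lines of plain Python. The abstract does not specify the exact procedures, so this example assumes the Benjamini-Hochberg rule for FDR control, a plain p-value threshold for FPR control, and hard (majority) voting over three classifiers; the p-values, labels, and function names are hypothetical.

```python
from collections import Counter


def select_fdr(p_values, alpha=0.05):
    """Benjamini-Hochberg: indices of features kept at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its BH threshold.
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


def select_fpr(p_values, alpha=0.05):
    """FPR control: keep every feature whose p-value is below alpha."""
    return [i for i, p in enumerate(p_values) if p < alpha]


def majority_vote(*model_predictions):
    """Hard voting: for each sample, return the most common predicted label."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


# Hypothetical per-descriptor p-values from a univariate test.
p_values = [0.001, 0.20, 0.012, 0.04, 0.60]
kept_fdr = select_fdr(p_values)
kept_fpr = select_fpr(p_values)

# Hypothetical binary predictions (e.g. for an hERG label) from three classifiers.
model_1 = [1, 0, 1, 1]
model_2 = [1, 1, 0, 1]
model_3 = [0, 0, 1, 1]
final = majority_vote(model_1, model_2, model_3)
print(kept_fdr, kept_fpr, final)
```

FDR control is stricter than the plain threshold here, which is why it keeps fewer descriptors. With three voters and binary labels a tie cannot occur; other configurations would need an explicit tie-breaking rule.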

Keywords: ADMET; ERα bioactivity; breast cancer; drug development; machine learning.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1. Flow chart of combined optimization method for dual feature screening.
FIGURE 2. Framework of ERα bioactivity prediction model.
FIGURE 3. Schematic diagram of the classification prediction model framework.
FIGURE 4. Scatter plot of ALogP.
FIGURE 5. Schematic diagram of the box plot.
FIGURE 6. Statistical results of 20 bioactive molecules based on Grey correlation analysis.
FIGURE 7. Statistical results of 20 bioactive molecules based on Spearman rank correlation coefficient analysis.
FIGURE 8. Feature variables and pIC50 fit results.
FIGURE 9. Classification model ROC curve with hERG as the target value.
FIGURE 10. Classification model ROC curve with MN as the target value.
