Environ Sci Technol. 2020 Dec 1;54(23):15546-15555.
doi: 10.1021/acs.est.0c05771. Epub 2020 Nov 19.

Comparing Machine Learning Models for Aromatase (P450 19A1)

Kimberley M Zorn et al. Environ Sci Technol. 2020.

Abstract

Aromatase, or cytochrome P450 19A1, catalyzes the aromatization of androgens to estrogens within the body. Changes in the activity of this enzyme can produce hormonal imbalances that can be detrimental to sexual and skeletal development. Inhibition of this enzyme can occur with drugs and natural products as well as environmental chemicals. Therefore, predicting potential endocrine disruption via exogenous chemicals requires that aromatase inhibition be considered in addition to androgen and estrogen pathway interference. Bayesian machine learning methods can be used for prospective prediction from the molecular structure without the need for experimental data. Herein, the generation and evaluation of multiple machine learning models utilizing different sources of aromatase inhibition data are described. These models are applied to two test sets for external validation with molecules relevant to drug discovery from the public domain. In addition, the performance of multiple machine learning algorithms was evaluated by comparing internal five-fold cross-validation statistics of the training data. These methods to predict aromatase inhibition from molecular structure, when used in concert with estrogen and androgen machine learning models, allow for a more holistic assessment of endocrine-disrupting potential of chemicals with limited empirical data and enable the reduction of the use of hazardous substances.
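The five-fold cross-validation mentioned in the abstract can be illustrated with a minimal, standard-library sketch of how a training set is partitioned. This is not the authors' pipeline: the function name `k_fold_indices`, the seed, and the sample count of 23 are assumptions made purely for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train, test) index splits for k-fold cross-validation.

    Hypothetical helper for illustration only; the paper's actual
    tooling is not described in this record.
    """
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the splits are reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Every fold except the i-th one forms the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Example: 23 hypothetical molecules split into five folds.
splits = k_fold_indices(23, k=5)
for fold_number, (train, test) in enumerate(splits, start=1):
    print(fold_number, len(train), len(test))
```

Each molecule lands in exactly one held-out fold, so every model is scored once on every training compound without having seen it during fitting.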

Conflict of interest statement

Competing interests:

S.E. is the owner of Collaborations Pharmaceuticals Inc., and K.M.Z., D.H.F., and T.R.L. are employees of the company. All other authors are employees of SC Johnson and Son, Inc.

Figures

Figure 1:
Machine learning algorithm comparisons across multiple metrics for the training datasets described in Table 2. The radius of a given point reflects the value of the metric at each corner of the radar plot. The ToxCast assay corresponds to “NVS_ADME_hCYP19A1”, and the Tox21 assay to “TOX21_Aromatase_Inhibition”. Abbreviations: Combo = ToxCast and Tox21 assays combined, BFHC = burst-flag hit-call, HC = hit-call, Kappa = Cohen’s Kappa, MCC = Matthews Correlation Coefficient, AC = Assay Central® (Bayesian), rf = random forest, knn = k-Nearest Neighbors, svc = support vector classification, bnb = Naïve Bayesian, ada = AdaBoosted decision trees, DL = deep learning architecture.
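Two of the metrics named in the caption, Cohen's Kappa and the Matthews Correlation Coefficient (MCC), can be computed directly from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function names and the toy labels are assumptions for illustration and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def cohen_kappa(tp, tn, fp, fn):
    """Agreement between prediction and truth, corrected for chance agreement."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1:
        return 0.0  # degenerate case: only one class present
    return (observed - expected) / (1 - expected)


def matthews_corrcoef(tp, tn, fp, fn):
    """Correlation between predicted and true binary labels, in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Toy example (hypothetical labels): 3 inhibitors and 5 non-inhibitors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
kappa = cohen_kappa(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(round(kappa, 4), round(mcc, 4))
```

Both metrics account for class imbalance, which is why they are often reported alongside accuracy for screening datasets where inactive compounds dominate.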
Figure 2:
Machine learning algorithm comparisons across multiple five-fold cross-validation metrics using either rank normalized scores (left) or ΔRNS (right). Box and whisker plots show individual points for values that fall outside the 5th–95th percentile range. Abbreviations: AC = Assay Central® (Bayesian), rf = Random Forest, knn = k-Nearest Neighbors, svc = Support Vector Classification, bnb = Naïve Bayesian, ada = AdaBoosted Decision Trees, DL = Deep Learning Architecture.
