Environ Health Perspect. 2016 Jul;124(7):1023-33.

doi: 10.1289/ehp.1510267. Epub 2016 Feb 23.

CERAPP: Collaborative Estrogen Receptor Activity Prediction Project

PMID: 26908244
PMCID: PMC4937869
DOI: 10.1289/ehp.1510267

Kamel Mansouri et al. Environ Health Perspect. 2016 Jul.

Affiliation

¹ National Center for Computational Toxicology, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA.

Abstract

Background: Humans are exposed to thousands of man-made chemicals in the environment. Some chemicals mimic natural endocrine hormones and, thus, have the potential to be endocrine disruptors. Most of these chemicals have never been tested for their ability to interact with the estrogen receptor (ER). Risk assessors need tools to prioritize chemicals for evaluation in costly in vivo tests, for instance, within the U.S. EPA Endocrine Disruptor Screening Program.

Objectives: We describe a large-scale modeling project called CERAPP (Collaborative Estrogen Receptor Activity Prediction Project) and demonstrate the efficacy of using predictive computational models trained on high-throughput screening data to evaluate thousands of chemicals for ER-related activity and prioritize them for further testing.

Methods: CERAPP combined multiple models developed in collaboration with 17 groups in the United States and Europe to predict ER activity of a common set of 32,464 chemical structures. Quantitative structure-activity relationship models and docking approaches were employed, mostly using a common training set of 1,677 chemical structures provided by the U.S. EPA, to build a total of 40 categorical and 8 continuous models for binding, agonist, and antagonist ER activity. All predictions were evaluated on a set of 7,522 chemicals curated from the literature. To overcome the limitations of single models, a consensus was built by weighting models on scores based on their evaluated accuracies.

Results: Individual model scores ranged from 0.69 to 0.85, showing high prediction reliabilities. Out of the 32,464 chemicals, the consensus model predicted 4,001 chemicals (12.3%) as high priority actives and 6,742 potential actives (20.8%) to be considered for further testing.

Conclusion: This project demonstrated that large libraries of chemicals can be screened using a consensus of different in silico approaches. The same concept will be applied in future projects related to other end points.
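The consensus step described in the Methods can be illustrated with a minimal sketch. The abstract states only that models were weighted on scores derived from their evaluated accuracies; the exact weighting formula is not given here, so the score-weighted vote below, the 0.5 threshold, and all function, parameter, and model names are assumptions made for illustration rather than the authors' implementation.

```python
from typing import Dict, Tuple


def weighted_consensus(
    predictions: Dict[str, int],
    scores: Dict[str, float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Tuple[float, int]:
    """Combine categorical calls (1 = active, 0 = inactive) from several models.

    Each model's vote is weighted by its evaluation score. Only models that
    actually made a prediction for the chemical contribute to the result.
    """
    total_weight = sum(scores[name] for name in predictions)
    if total_weight <= 0:
        raise ValueError("at least one model with a positive score is required")
    weighted_vote = sum(scores[name] * call for name, call in predictions.items())
    consensus_score = weighted_vote / total_weight
    return consensus_score, int(consensus_score >= threshold)


# Hypothetical models; the scores are chosen inside the reported 0.69-0.85 range.
example_scores = {"model_a": 0.85, "model_b": 0.69, "model_c": 0.75}
example_calls = {"model_a": 1, "model_b": 0, "model_c": 1}
consensus_score, consensus_call = weighted_consensus(example_calls, example_scores)
```

In this toy case the two models calling the chemical active carry 1.60 of the 2.29 total weight, so the consensus call is active. A chemical predicted by only a subset of models is handled by normalizing over the models that returned a call.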

Citation: Mansouri K, Abdelaziz A, Rybacka A, Roncaglioni A, Tropsha A, Varnek A, Zakharov A, Worth A, Richard AM, Grulke CM, Trisciuzzi D, Fourches D, Horvath D, Benfenati E, Muratov E, Wedebye EB, Grisoni F, Mangiatordi GF, Incisivo GM, Hong H, Ng HW, Tetko IV, Balabin I, Kancherla J, Shen J, Burton J, Nicklaus M, Cassotti M, Nikolov NG, Nicolotti O, Andersson PL, Zang Q, Politi R, Beger RD, Todeschini R, Huang R, Farag S, Rosenberg SA, Slavov S, Hu X, Judson RS. 2016. CERAPP: Collaborative Estrogen Receptor Activity Prediction Project. Environ Health Perspect 124:1023-1033; http://dx.doi.org/10.1289/ehp.1510267.

Conflict of interest statement

The views expressed in this paper are those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental Protection Agency or the U.S. Food and Drug Administration.

The authors declare they have no actual or potential competing financial interests.

Figures

**Figure 1**
ROC curves of the categorical corrected consensus predictions for binding evaluated against different sets of the evaluation set with variable numbers of literature sources. The number of available chemicals in the evaluation set (between brackets) decreased with higher numbers of literature sources. The true and false positive rates are determined based on the number of actives in the different sets of the evaluation set.

**Figure 2**
Box plot of the positive class potency levels in the corrected quantitative *consensus* predictions for binding. The concordance between models is the fraction of models that agree on the prediction for a given chemical. Boxes extend from the 25th to the 75th percentile, horizontal bars represent the median, whiskers indicate the 10th and 90th percentiles, and outliers are represented as points.

**Figure 3**
Variation of the balanced accuracy of the corrected categorical consensus predictions for binding with positive concordance (agreement between models on predictions for active chemicals) threshold at different numbers of literature sources.
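The two quantities named in the Figure 2 and Figure 3 captions can be sketched as follows. Balanced accuracy is computed with its standard definition, the mean of sensitivity and specificity. Positive concordance is taken here as the fraction of models that call a chemical active, which is one reading of the captions and should be treated as an assumption; the function names and toy data are hypothetical.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity for binary labels (1 = active)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def positive_concordance(model_calls: Sequence[int]) -> float:
    """Fraction of models that predict the chemical as active (assumed definition)."""
    if not model_calls:
        raise ValueError("at least one model call is required")
    return sum(model_calls) / len(model_calls)


# Toy evaluation data, not taken from the paper.
literature_labels = [1, 1, 0, 0, 0]
consensus_calls = [1, 0, 0, 0, 1]
ba = balanced_accuracy(literature_labels, consensus_calls)
concordance = positive_concordance([1, 1, 0, 1])
```

Applying a threshold on `positive_concordance` before accepting an active call, and recomputing `balanced_accuracy` at each threshold, would give a curve of the kind Figure 3 describes.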

Grants and funding

T32 GM067553/GM/NIGMS NIH HHS/United States
