Machine Learning of Toxicological Big Data Enables Read-Across Structure Activity Relationships (RASAR) Outperforming Animal Test Reproducibility
- PMID: 30007363
- PMCID: PMC6135638
- DOI: 10.1093/toxsci/kfy152
Abstract
Earlier we created a chemical hazard database via natural language processing of dossiers submitted to the European Chemicals Agency, covering approximately 10 000 chemicals. We identified repeat OECD guideline tests to establish the reproducibility of acute oral and dermal toxicity, eye and skin irritation, mutagenicity, and skin sensitization. Based on 350 to more than 700 chemicals per hazard, the probability that an OECD guideline animal test would yield the same result in a repeat test was 78%-96% (sensitivity 50%-87%). An expanded database with more than 866 000 chemical properties/hazards was then used as training data to model health hazards and chemical properties. The constructed models automate and extend the read-across method of chemical classification. The novel models, called RASARs (read-across structure activity relationships), use binary fingerprints and Jaccard distance to define chemical similarity. A large chemical similarity adjacency matrix is constructed from this similarity metric and is used to derive feature vectors for supervised learning. We show results on 9 health hazards from 2 kinds of RASARs: "Simple" and "Data Fusion". The "Simple" RASAR seeks to duplicate the traditional read-across method, predicting hazard from chemical analogs with known hazard data. The "Data Fusion" RASAR extends this concept by creating large feature vectors from all available property data rather than only the modeled hazard. Simple RASAR models tested in cross-validation achieve balanced accuracies of 70%-80%, with constraints on tested compounds. Cross-validation of Data Fusion RASARs shows balanced accuracies in the 80%-95% range across 9 health hazards, with no constraints on tested compounds.
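The similarity-based pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy fingerprints, the labels, the function names, and the final decision rule are all assumptions made for illustration. In particular, the abstract states that the feature vectors feed a supervised learner, whereas this sketch substitutes a simple "closest labeled analog wins" rule so that it stays dependency-free. Jaccard distance is taken as 1 minus the Jaccard similarity computed here.

```python
from typing import List, Tuple


def jaccard_similarity(a: List[int], b: List[int]) -> float:
    """Jaccard similarity of two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention (assumption): two all-zero fingerprints count as identical.
    return both / either if either else 1.0


def similarity_matrix(fps: List[List[int]]) -> List[List[float]]:
    """Pairwise similarity (adjacency) matrix over all fingerprints."""
    n = len(fps)
    return [[jaccard_similarity(fps[i], fps[j]) for j in range(n)] for i in range(n)]


def simple_features(sim: List[List[float]], labels: List[int], i: int) -> Tuple[float, float]:
    """Features for chemical i: similarity to its closest positive and
    closest negative analog, excluding chemical i itself."""
    others = [j for j in range(len(labels)) if j != i]
    pos = [sim[i][j] for j in others if labels[j] == 1]
    neg = [sim[i][j] for j in others if labels[j] == 0]
    return max(pos, default=0.0), max(neg, default=0.0)


def balanced_accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Mean of sensitivity and specificity; assumes both classes are present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Hypothetical toy data: 6-bit fingerprints with a binary hazard label.
fps = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
]
labels = [1, 1, 0, 0, 1]

sim = similarity_matrix(fps)

# Leave-one-out prediction with the stand-in rule: predict "hazardous"
# when the closest positive analog is more similar than the closest negative.
preds = []
for i in range(len(fps)):
    nearest_pos, nearest_neg = simple_features(sim, labels, i)
    preds.append(1 if nearest_pos > nearest_neg else 0)

print(preds, balanced_accuracy(labels, preds))
```

A Data Fusion variant would, under the same assumptions, widen the feature vector by repeating the closest-analog features for every available property or hazard rather than only the one being modeled.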
Comment in
- Software beats animal tests at predicting toxicity of chemicals. Nature. 2018 Jul;559(7713):163. doi: 10.1038/d41586-018-05664-2. PMID: 29995868. No abstract available.
- Oy Vey! A Comment on "Machine Learning of Toxicological Big Data Enables Read-Across Structure Activity Relationships Outperforming Animal Test Reproducibility". Toxicol Sci. 2019 Jan 1;167(1):3-4. doi: 10.1093/toxsci/kfy286. PMID: 30500930. Free PMC article. No abstract available.