Identification of biomarkers that distinguish chemical contaminants based on gene expression profiles
- PMID: 24678894
- PMCID: PMC4051169
- DOI: 10.1186/1471-2164-15-248
Abstract
Background: High-throughput transcriptomic profiles, such as those generated using microarrays, have been useful in identifying biomarkers for classification and toxicity prediction. Here, we investigated the use of microarrays to predict chemical toxicants and their possible mechanisms of action.
Results: In this study, in vitro cultures of primary rat hepatocytes were exposed to 105 chemicals and vehicle controls, representing 14 compound classes. We comprehensively compared various methods for normalizing gene expression profiles, together with feature selection and classification algorithms, for the classification of these 105 chemicals into 14 compound classes. We found that normalization had little effect on the averaged classification accuracy. Two support vector machine (SVM) methods, LibSVM and sequential minimal optimization, had better classification performance than the other methods. SVM recursive feature elimination (SVM-RFE) had the highest overfitting rate when an independent dataset was used for prediction. Therefore, we developed a new feature selection algorithm, called the gradient method, that had relatively high training classification accuracy and prediction accuracy, with the lowest overfitting rate of the methods tested. Analysis of biomarkers that distinguished the 14 classes of compounds identified a group of genes, principally involved in cell cycle function, that were significantly downregulated by metal and inflammatory compounds but were induced by anti-microbial compounds, cancer-related drugs, pesticides, and PXR mediators.
Conclusions: Our results indicate that using microarrays and a supervised machine learning approach to predict chemical toxicants, their potential toxicity, and their mechanisms of action is practical and efficient. Choosing the right feature selection and classification algorithms for this multi-category classification and prediction task is critical.
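The abstract compares methods by training classification accuracy, prediction accuracy on an independent dataset, and an overfitting rate, but does not define the overfitting rate here. The following is a minimal sketch, assuming the overfitting rate can be approximated as the gap between training accuracy and independent-set accuracy. That definition, the function names, and the toy labels are illustrative assumptions, not the authors' method or data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def overfitting_gap(train_acc: float, independent_acc: float) -> float:
    """Assumed metric: training accuracy minus independent-set accuracy."""
    return train_acc - independent_acc


# Hypothetical compound-class labels for illustration only (not study data).
train_true = ["metal", "metal", "pesticide", "pesticide", "PXR", "PXR"]
train_pred = ["metal", "metal", "pesticide", "pesticide", "PXR", "PXR"]
indep_true = ["metal", "pesticide", "PXR", "PXR"]
indep_pred = ["metal", "PXR", "PXR", "metal"]

train_acc = accuracy(train_true, train_pred)
indep_acc = accuracy(indep_true, indep_pred)
gap = overfitting_gap(train_acc, indep_acc)

print(f"training accuracy:    {train_acc:.2f}")
print(f"independent accuracy: {indep_acc:.2f}")
print(f"overfitting gap:      {gap:.2f}")
```

Under this assumed definition, a feature selection method that fits the training set perfectly but generalizes poorly shows a large gap, which is the pattern the abstract reports for SVM-RFE relative to the gradient method.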