2023 May 9;12(5):692.
doi: 10.3390/biology12050692.

Utilization of Computer Classification Methods for Exposure Prediction and Gene Selection in Daphnia magna Toxicogenomics

Berkay Paylar et al. Biology (Basel).

Abstract

Zinc (Zn) is an essential element that influences many cellular functions. Depending on bioavailability, Zn can cause both deficiency and toxicity. Zn bioavailability is influenced by water hardness. Therefore, water quality analysis for health-risk assessment should consider both Zn concentration and water hardness. However, the exposure media used in traditional toxicology tests are set to defined hardness levels and do not represent the diverse water chemistry compositions observed in nature. Moreover, these tests commonly use whole-organism endpoints, such as survival and reproduction, which require high numbers of test animals and are labor-intensive. Gene expression stands out as a promising alternative that provides insight into molecular events that can be used for risk assessment. In this work, we apply machine learning techniques to classify Zn concentration and water hardness from Daphnia magna gene expression measured by quantitative PCR. A method for gene ranking was explored using a technique from game theory, namely Shapley values. The results show that standard machine learning classifiers can classify both Zn concentration and water hardness simultaneously, and that Shapley values are a versatile and useful alternative for gene ranking that can provide insight into the importance of individual genes.
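
To make the gene-ranking idea concrete, the following is a minimal sketch of how Shapley values attribute a model's output to individual features and how the features can then be ranked. It is not the authors' pipeline: the gene names (`geneA`, `geneB`, `geneC`) and the toy scoring function `toy_model_score` are hypothetical, chosen only so the result can be checked by hand. The exact computation enumerates every coalition of features, which is feasible only for a handful of features; practical tools approximate it.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values, computed by enumerating every coalition."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when it joins coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_model_score(coalition):
    """Hypothetical model output when only the genes in `coalition` are used.

    geneA and geneB each contribute alone and also interact;
    geneC carries no information. These numbers are made up.
    """
    score = 0.0
    if "geneA" in coalition:
        score += 0.5
    if "geneB" in coalition:
        score += 0.2
    if "geneA" in coalition and "geneB" in coalition:
        score += 0.3
    return score


genes = ["geneA", "geneB", "geneC"]  # hypothetical gene names
phi = shapley_values(toy_model_score, genes)

# Rank genes from most to least impactful by absolute Shapley value.
ranking = sorted(genes, key=lambda g: abs(phi[g]), reverse=True)

for g in ranking:
    print(f"{g}: {phi[g]:.3f}")
```

In this toy case the interaction term is split evenly between `geneA` and `geneB`, the uninformative `geneC` receives zero, and the values sum to the difference between the score with all genes and the score with none. That additivity is the property that makes Shapley values usable as a per-gene importance score.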

Keywords: Zn; bioavailability; biomarker; machine learning; water hardness.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. A schematic chart that summarizes the process of training a machine learning model on gene expression data.
Figure 2. The t-SNE plots for the data set used in this study are colored by (a) three water hardness classes, (b) five Zn level classes, and (c) both water and Zn level.
Figure 3. Error bar (mean and standard deviation on normalized raw values) for all 22 genes colored by water hardness.
Figure 4. Gene ranking using Shapley values based on contribution to prediction of water hardness (a) and Zn concentration (b). The genes are ranked by the sum of the total impact for all classes from most total impactful gene (top) to least total impactful gene (bottom) for class prediction.
Figure 5. Gene impact on model output for class (a) soft, (b) medium, and (c) hard water hardness using SHAP value impact on model output. The genes are ranked for each class from most impactful gene (top) to least impactful gene (bottom) for class prediction.
Figure 6. Gene impact on model output for class soft (a), medium (b), and hard water (c) prediction for a soft water sample, respectively. Machine learning predicted the sample class correctly as soft class by assigning highest probability (0.67) to soft water.

