Environ Sci Technol. 2024 Oct 22;58(42):18822-18833.
doi: 10.1021/acs.est.4c05004. Epub 2024 Oct 11.

Predictive Understanding of Stream Salinization in a Developed Watershed Using Machine Learning

Jared D Smith et al. Environ Sci Technol.

Abstract

Stream salinization is a global issue, yet few models can provide reliable salinity estimates for unmonitored locations at the time scales required for ecological exposure assessments. Machine learning approaches are presented that use spatially limited high-frequency monitoring and spatially distributed discrete samples to estimate the daily stream-specific conductance across a watershed. We compare the predictive performance of space- and time-unaware Random Forest models and space- and time-aware Recurrent Graph Convolution Neural Network models (KGE: 0.67 and 0.64, respectively) and use explainable artificial intelligence methods to interpret model predictions and understand salinization drivers. These models are applied to the Delaware River Basin, a developed watershed with diverse land uses that experiences anthropogenic salinization from winter deicer applications. These models capture seasonality for the winter first flush of deicers, and the streams with elevated predictions correspond well with indicators of deicer application. This result suggests that these models can be used to identify potential salinity-impaired streams for winter best management practices. Daily salinity predictions are driven primarily by land cover (urbanization) trends that may represent anthropogenic salinization processes and weather at time scales up to three months. Such modeling approaches are likely transferable to other watersheds and can be applied to further understand salinization risks and drivers.

Keywords: Delaware River Basin; deicers; explainable artificial intelligence (XAI); freshwater salinization; machine learning; seasonality; urban; watershed.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
(A) Overview of the machine learning modeling workflow that predicts specific conductance in nontidal stream reaches (purple). (B) Monthly and annual counts of specific conductance observations within training (blue) and testing (red) data sets. (C) Spatial locations of training and testing segments and segments that either did not have observations or were tidally influenced and not modeled. For all maps, stream segments are from the National Hydrologic Model, the basin outline is from the get_huc function of the R nhdplusTools package (ref 32), the U.S. state borders are from the us_states data set of the R spData package (ref 33), and the coordinate system is WGS84 with an Albers projection for the inset map. Additional data diagnostic figures are provided in Section S5.
Figure 2
Test set empirical cumulative distribution functions (CDFs) by modeled segment for (A) RMSE and (B) KGE for each of the model and attribute combinations. The vertical dashed line indicates the KGE for a model that predicts the mean of the observations. For reference, the mean (95% range) of the SC observations is 244 (18, 610) μS/cm.
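The two metrics in this caption can be sketched in a few lines. The example below assumes the original 2009 Kling-Gupta efficiency formulation (correlation, variability ratio, bias ratio); the record does not say which KGE variant the authors used, and the function names and sample values are illustrative only. It also shows why a mean-of-observations benchmark has a fixed KGE: its variability ratio is 0 and its correlation is taken as 0 by convention, giving 1 - sqrt(2).

```python
import numpy as np

def rmse(obs, sim):
    """Root-mean-square error between observed and simulated values."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def kge(obs, sim):
    """Kling-Gupta efficiency, 2009 formulation (an assumption here)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    alpha = sim.std() / obs.std()    # variability ratio
    beta = sim.mean() / obs.mean()   # bias ratio
    if np.isclose(sim.std(), 0.0):
        # Correlation is undefined for a constant prediction;
        # treat it as 0, the usual convention for the mean benchmark.
        r = 0.0
    else:
        r = float(np.corrcoef(obs, sim)[0, 1])
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))

# Hypothetical specific conductance observations (μS/cm) for one segment.
obs = np.array([100.0, 200.0, 300.0, 400.0])
mean_model = np.full_like(obs, obs.mean())  # always predicts the observed mean

perfect_kge = kge(obs, obs)            # a perfect model scores 1
benchmark_kge = kge(obs, mean_model)   # the mean benchmark scores 1 - sqrt(2)
benchmark_rmse = rmse(obs, mean_model)
```

Under these assumptions, a per-segment KGE to the right of the benchmark value indicates a model that is more informative than simply predicting the observed mean.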
Figure 3
(A) Observed SC across the watershed for more forested (>75% upstream area) and other modeled segments with more than 100 observations from 2020 to 2022. Data are normalized by segment, and lighter blue colors indicate higher SC. Segments are plotted in descending order by mean SC. (B,C) Predictions for the best-performing models on a forested segment (139; Delaware River at Lordville, NY) and other segment (873; Wissahickon Creek at Mouth, Philadelphia, PA) in the test set, respectively. (D) Monthly and annual overall RMSE values for the test set. Additional model performance diagnostics are provided in Section S6 (Figures S11–S15).
Figure 4
(A,B) Winter (January, February, March) and other season RF predicted long-term mean (2000–2021) specific conductance (SC) overlain with observed Cl:Br locations in those seasons and time periods. Darker points indicate higher ratios. (C) Relationship between the predicted long-term mean SC and the by-segment mean Cl:Br. Ordinary least-squares regression lines and linear Pearson correlation coefficients are provided as a visual guide. Ratios are typically less than 300 for continental precipitation and most naturally occurring brines, around 300 for seawater, greater than 300 for most anthropogenic sources (agriculture, septic systems), and can be much greater for rock salt deicers. (D) Land cover map (2019; ref 50) and change over time. Versions of panel (C) with observed SC and RGCN results are provided in Section S6 (Figures S19 and S20).
Figure 5
SHAP values for the top 10 (A) static and (B) dynamic attributes displayed in order of overall model importance (top to bottom) for each predicted value (points). Attribute importance is calculated as the average absolute SHAP value. A SHAP value of 0 corresponds to the mean predicted SC value across all of the data points. Adding the overall mean to the summation of SHAP values for a single point equals the predicted SC for that point. Attributes correspond to the entire upstream watershed, unless they begin with CAT (immediate catchment only). (C) SHAP values (points) as a function of the most important attribute (upstream areal proportion of high density urban land use), and the segment mean Cl:Br as a function of the same attribute. Blue lines are smoothed fits to the data.
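The additive property and the importance measure described in this caption can be illustrated with a toy example. The sketch below does not use the authors' Random Forest or RGCN models; it assumes a hypothetical linear model with made-up attribute names and weights, because for a linear model with independently treated features the SHAP value of attribute i has the closed form w_i (x_i - mean(x_i)).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical attributes and weights, for illustration only.
names = ["urban_high_density", "forest", "precip_90d"]
weights = np.array([300.0, -80.0, -40.0])
intercept = 150.0

X = rng.random((200, 3))            # 200 synthetic data points in [0, 1)
pred = X @ weights + intercept      # toy "predicted SC" values

# A SHAP value of 0 corresponds to the mean prediction over all points.
base_value = pred.mean()

# Closed-form SHAP values for a linear model (feature independence assumed).
shap_values = (X - X.mean(axis=0)) * weights

# Additivity: overall mean plus the sum of SHAP values gives each prediction.
reconstructed = base_value + shap_values.sum(axis=1)

# Attribute importance: average absolute SHAP value, ranked high to low.
importance = np.abs(shap_values).mean(axis=0)
ranking = [names[i] for i in np.argsort(importance)[::-1]]
```

For tree ensembles or neural networks the SHAP values have no such closed form and would need an explainer algorithm, but the additivity check and the mean-absolute-value ranking work the same way.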
