Int J Data Sci Anal. 2023;15(1):9-32.
doi: 10.1007/s41060-022-00347-8. Epub 2022 Aug 31.

From data to interpretable models: machine learning for soil moisture forecasting

Aniruddha Basak et al. Int J Data Sci Anal. 2023.

Abstract

Soil moisture is critical to agricultural business, ecosystem health, and certain hydrologically driven natural disasters. Monitoring data, though, is prone to instrumental noise, wide-ranging extrema, and nonstationary response to rainfall where ground conditions change. Furthermore, existing soil moisture models generally forecast poorly for time periods greater than a few hours. To improve such forecasts, we introduce two data-driven models, the Naive Accumulative Representation (NAR) and the Additive Exponential Accumulative Representation (AEAR). Both of these models are rooted in deterministic, physically based hydrology, and we study their capabilities in forecasting soil moisture over time periods longer than a few hours. Learned model parameters represent the physically based unsaturated hydrological redistribution processes of gravity and suction. We validate our models using soil moisture and rainfall time series data collected from a steep-gradient, post-wildfire site in southern California. Data analysis is complicated by rapid landscape change observed in steep, burned hillslopes in response to even small to moderate rain events. The proposed NAR and AEAR models are, in forecasting experiments, shown to be competitive with several established and state-of-the-art baselines. The AEAR model fits the data well for three distinct soil textures at variable depths below the ground surface (5, 15, and 30 cm). Similarly robust results are demonstrated in controlled, laboratory-based experiments. Our AEAR model includes readily interpretable hydrologic parameters and provides more accurate forecasts than existing models for time horizons of 10–24 h. Such extended periods of warning for natural disasters, such as floods and landslides, provide actionable knowledge to reduce loss of life and property.

Keywords: Data analysis; Interpretable machine learning; Model optimization and fitting; Monitoring; Post-fire landslides; Soil moisture forecasting.

Figures

Fig. 1
Left: Destructive post-Thomas Fire debris flow impacts of January 9, 2018 in Montecito, California. Right: U.S. Geological Survey researcher responds to deadly January 9, 2018, debris flows in Montecito, California. Distributed rainfall runoff in the absence of vegetation allows readily erodible material to be transported downslope, gaining mass and velocity (momentum) through the channel network and resulting in damaging and deadly debris flows. Photographs by USGS
Fig. 2
Left: Photograph of a burned hillslope with rainfall, overland flow, and soil moisture monitoring instrumentation on the Pepperdine University campus in Malibu, California, taken after the 2007 Canyon Fire but before rainfall. Right: Photograph from December 19, 2007, of a small post-fire debris flow and flood on the Pepperdine University campus following minor rainfall measured by the USGS monitoring array shown at left. Photographs by USGS
Fig. 3
Rainfall intensity (blue line) and soil moisture measurements at three depths below ground surface (black, green, and red lines at 5, 15, and 30 cm, respectively) from the Canyon Fire field monitoring site depicted in Fig. 2
Fig. 4
Inputs, outputs, and steps of our soil moisture data analysis process; see Sect. 3 for further explanations
Fig. 5
Single-sided amplitude spectrum of 5 cm soil moisture data from the Canyon Fire study area
Fig. 6
Two different forecasts at $t+\tau$, namely $M_{t+\tau} = M_{97}$, illustrating the difference between the regular and irregular approaches. For clarity, we are showing regular observed measurements (model inputs, blue circles) for both cases, even though they are only present for forecasting purposes in the regular case. Left: Forecast for $M_{97}$ (model output, green rectangle) using regular measurement $M_t = M_{94}$ (0.213). Right: Forecast for $M_{97}$ (model output, red rectangle) using an irregular “pseudo-measurement” $\hat{M}_t = \hat{M}_{94}$ (0.185)
Fig. 7
Input hyperparameters of STL’s R implementation as used in the experiments; see Sect. 5.2
Fig. 8
Results from 30 cm soil moisture data for AEAR and NAR models trained by four different optimization algorithms. Comparisons are of forecast error (a, b) and runtime of training (c). The mean and variance of all performance metrics are computed over 30 independent runs
Fig. 9
Soil moisture forecasts for 5 and 15 cm depths using the AEAR model, based on a one-point measurement of soil moisture at $t=0$. The points to the left of each of the blue dashed vertical lines are in the training set $M_T$, and the points to the right are in the test set $M_P$
Fig. 10
Soil moisture forecasts for 30 cm depth using the AEAR model. Points to the left of the blue dashed vertical line are in the training set $M_T$, and points to the right are in the forecasting test set $M_P$
Fig. 11
Comparison of different model forecasts (SEM, NAR, AEAR) at 30 cm depth. The standard error in these forecasts is 0.037, 0.022, and 0.020, respectively. The vertical blue dashed line separates the training set $M_T$ (to the left) and forecasting set $M_P$ (to the right)
Fig. 12
Forecast errors as a function of prediction horizon (τ), varying on the x-axis, for regular measurements. Top: Standard error is on the y-axis. Bottom: Maximum error is on the y-axis
Fig. 13
Forecast error for a 30 cm probe as a function of prediction horizon (τ), varying on the x-axis, for regular measurements. Top: MAPE on the y-axis. Bottom: Maximum error on the y-axis
Fig. 14
Soil moisture forecasts at 30 cm depth using an LSTM(2) model in the regular setting. Points to the left of the blue dashed vertical line are in the training set $M_T$, and points to the right are in the forecasting test set $M_P$
Fig. 15
Results for controlled experimental bucket data. Forecast errors for different models are on the y-axis, as a function of varying τ (in regular measurements) on the x-axis. Top: Standard error on the y-axis. Bottom: Maximum error on the y-axis
Fig. 16
Partition of Canyon Fire soil moisture data from 30 cm deep probe measurements and forecasts of trained AEAR models for each period. To compare the model parameters for different periods, we partitioned the almost four months of the winter 2007–2008 season (December–March) into Period 1 (“December”), Period 2 (“January”), Period 3 (“February”), and Period 4 (“March”)
Fig. 17
Variation in wetting and drying properties of soil in AEAR model parameters. We depict the variations over a winter season’s data, shown in Fig. 16, in both time (different wetting rates over “months,” shown in (a)) and space (drying rates at three different depths below ground surface, shown in (b))
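
Several captions above (Figs. 12, 13, and 15) report forecast error as a function of the prediction horizon τ. The sketch below shows one plausible way to compute such curves. It is illustrative only and is not taken from the paper: the function names are hypothetical, the "standard error" is assumed here to be a root-mean-square error (the captions do not define it), and a simple persistence forecaster stands in for the NAR and AEAR models, whose equations are not given on this page.

```python
import math


def forecast_errors(observed, forecast):
    """Return RMSE, maximum absolute error, and MAPE (%) for paired series.

    Assumption: RMSE is used as a stand-in for the "standard error"
    named in the captions; the paper's exact definition may differ.
    """
    if len(observed) != len(forecast) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    residuals = [f - o for o, f in zip(observed, forecast)]
    n = len(residuals)
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    max_error = max(abs(r) for r in residuals)
    # MAPE is undefined for zero observations; soil moisture readings
    # are assumed to be strictly positive here.
    mape = 100.0 * sum(abs(r) / abs(o) for r, o in zip(residuals, observed)) / n
    return {"rmse": rmse, "max_error": max_error, "mape": mape}


def errors_by_horizon(series, predict, horizons):
    """Score a forecaster at each horizon tau (measured in samples).

    `predict(m_t, tau)` is a hypothetical callable that returns a
    forecast of the value tau steps after the measurement m_t.
    """
    results = {}
    for tau in horizons:
        observed = series[tau:]
        forecast = [predict(series[t], tau) for t in range(len(series) - tau)]
        results[tau] = forecast_errors(observed, forecast)
    return results


def persistence(m_t, tau):
    """Baseline stand-in: forecast that moisture stays at its current value."""
    return m_t


# Made-up readings for illustration only (not data from the study).
moisture = [0.20, 0.21, 0.25, 0.24, 0.22, 0.21, 0.20]
table = errors_by_horizon(moisture, persistence, horizons=[1, 2, 3])
for tau, scores in table.items():
    print(tau, scores)
```

Replacing `persistence` with a trained model's prediction function would yield per-horizon error curves of the kind plotted in those figures, under the assumptions stated above.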
