R Soc Open Sci. 2025 May 14;12(5):250012.
doi: 10.1098/rsos.250012. eCollection 2025 May.

Where to refine spatial data to improve accuracy in crop disease modelling: an analytical approach with examples for cassava

Yevhen F Suprunenko et al. R Soc Open Sci. 2025.

Abstract

Epidemiological modelling plays an important role in global food security by informing strategies for the control and management of invasion and spread of crop diseases. However, the underlying data on spatial locations of host crops that are susceptible to a pathogen are often incomplete and inaccurate, thus reducing the accuracy of model predictions. Obtaining and refining datasets that fully represent a host landscape across territories can be a major challenge when predicting disease outbreaks. Therefore, it would be an advantage to prioritize areas in which data refinement efforts should be directed to improve the accuracy of epidemic prediction. In this paper, we present an analytical method to identify areas where potential errors in mapped host data would have the largest impact on modelled pathogen invasion and short-term spread. The method is based on an analytical approximation for the rate at which susceptible host crops become infected at the start of an epidemic. We show how implementing spatial prioritization for data refinement in a cassava-growing region in sub-Saharan Africa could be an effective means for improving accuracy when modelling the dispersal and spread of the crop pathogen cassava brown streak virus.

Keywords: analytical approximation; crop landscape; epidemic invasion; epidemiological model; infection rate; spatially explicit individual-based model.

Conflict of interest statement

We declare we have no competing interests.

Figures

Figure 1.
Host landscape. Data (as raster with spatial resolution 1 km by 1 km) on spatial distribution of cassava production extracted from CassavaMap [7] and converted to fields [5] (see §2); here, we show the entire 336 km-by-336 km landscape H (available from Figshare [24]). Raster cells with zero values and those with no data are both displayed in white; grey values (from the raster) indicate low values. We resolved the 336 km-by-336 km landscape into a square lattice with a mesh size of 24 km. For convenience, the 24 km-by-24 km area outlined by the blue boundary is used in figure 2 to illustrate the analyses that were conducted on the entire 336 km-by-336 km landscape.
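The lattice described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the landscape is taken to be a 336-by-336 NumPy array with one value per 1 km cell, the random values stand in for the real CassavaMap-derived data, and `split_into_blocks` is a hypothetical helper name.

```python
import numpy as np

def split_into_blocks(raster, block):
    """Return a view of `raster` as (n_block_rows, n_block_cols, block, block)."""
    rows, cols = raster.shape
    if rows % block or cols % block:
        raise ValueError("raster dimensions must be multiples of the block size")
    # Split each axis into (block index, offset within block), then bring
    # the two block indices to the front.
    return raster.reshape(rows // block, block, cols // block, block).swapaxes(1, 2)

# Placeholder data (assumption): number of fields per 1 km cell.
rng = np.random.default_rng(0)
landscape = rng.integers(0, 5, size=(336, 336))

# 336 km / 24 km = 14 lattice cells per side, each holding 24 x 24 raster cells.
blocks = split_into_blocks(landscape, 24)
fields_per_block = blocks.sum(axis=(2, 3))  # total fields N in each 24 km cell
```

Here `blocks[i, j]` is the 24 km-by-24 km area in lattice row `i` and column `j`, and `fields_per_block[i, j]` is the corresponding total number of fields.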
Figure 2.
Host data and potential errors. (a) The landscape H represents the available data extracted from CassavaMap (figure 1). The small sample area shown here is the same area outlined by the blue boundary in figure 1. A black arrow in H denotes the location of a single initially infected cassava field (the same for all landscapes). (b–d) Different scenarios from H to account for potential errors in the mapped cassava landscape, illustrated on a 2 km-by-2 km area shown above each landscape (see also §2). Each tile represents a fixed number of fields, and the red tiles in (d) represent +20% in addition to two tiles underneath. In all panels, zero values are displayed in white.
Figure 3.
Impact of potential errors in host data on results of epidemic modelling. (a) Dispersal kernel b(x) of a pathogen (CBSV), sampled from results of parameter estimation by Godding et al. [5] (see §2). (b) Mean number I(t) of infected fields of cassava at t=6 months after the start of an epidemic obtained from 1000 computer simulations for each landscape. (c) Mean, median and percentiles of results of computer simulations shown together with estimates of infection rate r from equation (2.1) multiplied by the total number N of cassava fields estimated for corresponding 24 km-by-24 km areas from figure 1.
Figure 4.
Finding areas where spatial host data should be refined to improve accuracy in crop disease modelling. (a) Presume that the alternative landscape, H+20%, is a realistic surrogate for the actual landscape (cf. figure 1d), i.e. H+20% has 20% more cassava fields than landscape H extracted from available data (see §2 for details). (b) Spatial prioritization for data refinement: in each 24 km-by-24 km cell within the entire landscape, we identified five 2 km-by-2 km cells with the highest impact on epidemic spread and denoted them as ‘top priority areas’, outlined by the red boundary. The small sample area shown here is the same area outlined by the blue boundary in figure 1. (c) The four landscapes: no fields are added (H); fields are added inside top priority areas (H+20% in), outside top priority areas (H+20% out) or across the domain (H+20%), subject to a maximum increase of 20% over the default map. (d) Mean number of infected fields obtained from 1000 computer simulations. (e) Numerical values for mean, median and percentiles of infected fields, together with estimates of quantities rN for each landscape. (f) The effect of spatial prioritization and subsequent data refinement within identified top priority areas on the accuracy of epidemic model predictions (see §3 for details). The computer code and data, including entire landscapes and the map of top priority areas used in this work, are available from Figshare [24,27].
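The selection step in panel (b) can be illustrated with a short Python sketch. It is a hedged example, not the published code: it assumes an impact score is already available for every 2 km-by-2 km cell (the paper derives this from its analytical infection-rate approximation, which is not reproduced here), so the `impact` array below is random placeholder data and `top_priority_mask` is a hypothetical helper.

```python
import numpy as np

def top_priority_mask(impact, block=12, k=5):
    """Mark the k highest-impact cells inside every block-by-block tile."""
    rows, cols = impact.shape
    if rows % block or cols % block:
        raise ValueError("impact dimensions must be multiples of the block size")
    nb_r, nb_c = rows // block, cols // block
    # Group cells by tile, then flatten each tile to a 1D list of scores.
    tiles = impact.reshape(nb_r, block, nb_c, block).swapaxes(1, 2)
    flat = tiles.reshape(nb_r, nb_c, block * block)
    # Indices of the k largest scores in each tile (order within the k is arbitrary).
    top_idx = np.argpartition(flat, -k, axis=-1)[..., -k:]
    flat_mask = np.zeros(flat.shape, dtype=bool)
    np.put_along_axis(flat_mask, top_idx, True, axis=-1)
    # Undo the tiling so the mask aligns with the original grid.
    return flat_mask.reshape(nb_r, nb_c, block, block).swapaxes(1, 2).reshape(rows, cols)

# Placeholder impact scores (assumption): one value per 2 km cell,
# so the 336 km landscape is 168 x 168 and each 24 km cell is 12 x 12.
rng = np.random.default_rng(1)
impact = rng.random((168, 168))
priority = top_priority_mask(impact, block=12, k=5)
```

With these dimensions the mask flags five cells in each of the 14 × 14 lattice cells; ties in the impact score are broken arbitrarily by `np.argpartition`.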

References

    1. Cunniffe NJ, Koskella B, E. Metcalf CJ, Parnell S, Gottwald TR, Gilligan CA. 2015. Thirteen challenges in modelling plant diseases. Epidemics 10, 6–10. (10.1016/j.epidem.2014.06.002)
    2. Meyer M, et al. 2021. Wheat rust epidemics damage Ethiopian wheat production: a decade of field disease surveillance reveals national-scale trends in past outbreaks. PLoS ONE 16, e0245697. (10.1371/journal.pone.0245697)
    3. Bradshaw CD, et al. 2022. Irrigation can create new green bridges that promote rapid intercontinental spread of the wheat stem rust pathogen. Environ. Res. Lett. 17, 114025. (10.1088/1748-9326/ac9ac7)
    4. Blasch G, et al. 2024. Ethiopian Crop Type 2020 (EthCT2020) dataset: crop type data for environmental and agricultural remote sensing applications in complex Ethiopian smallholder wheat-based farming systems (Meher season 2020/21). Data Brief 54, 110427. (10.1016/j.dib.2024.110427)
    5. Godding D, Stutt ROJH, Alicai T, Abidrabo P, Okao-Okuja G, Gilligan CA. 2023. Developing a predictive model for an emerging epidemic on cassava in sub-Saharan Africa. Sci. Rep. 13, 12603. (10.1038/s41598-023-38819-x)