Int J Health Geogr. 2023 Jun 2;22(1):12.
doi: 10.1186/s12942-023-00331-w.

Open-source environmental data as an alternative to snail surveys to assess schistosomiasis risk in areas approaching elimination

Elise N Grover et al. Int J Health Geogr.

Abstract

Background: Although the presence of intermediate host snails is a necessary condition for local schistosomiasis transmission, using them as surveillance targets in areas approaching elimination is challenging: snail habitats are patchy and dynamic, which makes collecting and testing snails labor-intensive. Meanwhile, geospatial analyses that rely on remotely sensed data are becoming popular tools for identifying environmental conditions that contribute to pathogen emergence and persistence.

Methods: In this study, we assessed whether open-source environmental data can be used to predict the presence of human Schistosoma japonicum infections among households with a similar or improved degree of accuracy compared to prediction models developed using data from comprehensive snail surveys. To do this, we used infection data collected from rural communities in Southwestern China in 2016 to develop and compare the predictive performance of two Random Forest machine learning models: one built using snail survey data, and one using open-source environmental data.

Results: The environmental data models outperformed the snail data models in predicting household S. japonicum infection: the environmental model achieved an estimated accuracy of 0.89 and a Cohen's kappa of 0.49, compared with an accuracy of 0.86 and a kappa of 0.37 for the snail model. The Normalized Difference in Water Index (an indicator of surface water presence) within half to one kilometer of the home and the distance from the home to the nearest road were among the top-performing predictors in our final model. Homes were more likely to have infected residents if they were farther from roads or nearer to waterways.
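To make the two reported metrics concrete, the following minimal sketch computes accuracy and Cohen's kappa from a 2x2 confusion matrix. The function name and the counts in the example are hypothetical and are not taken from the study; they only illustrate how a classifier can reach high accuracy with a more modest kappa when positive households are rare.

```python
import math


def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) for a 2x2 confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement between predictions and truth (this is accuracy).
    observed = (tp + tn) / n

    # Agreement expected by chance, from the predicted and actual marginals.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((fn + tn) / n) * ((fp + tn) / n)
    expected = p_pos + p_neg

    if expected == 1.0:
        # Kappa is undefined when every case falls in a single class.
        return observed, math.nan
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical counts (NOT from the paper): 15 infected households out of 100.
acc, kappa = accuracy_and_kappa(tp=10, fp=5, fn=5, tn=80)
print(f"accuracy={acc:.2f}, kappa={kappa:.2f}")
```

Because kappa discounts agreement that would occur by chance, it is a more conservative measure than accuracy in low-prevalence settings such as areas approaching elimination.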

Conclusion: Our results suggest that in low-transmission environments, leveraging open-source environmental data can yield more accurate identification of pockets of human infection than using snail surveys. Furthermore, the variable importance measures from our models point to aspects of the local environment that may indicate increased risk of schistosomiasis. For example, households were more likely to have infected residents if they were farther from roads or were surrounded by more surface water, highlighting areas to target in future surveillance and control efforts.
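The surface-water indicator named in the Results can be illustrated with a short sketch. The abstract does not state which water-index formulation or sensor bands were used, so the code below assumes the common green/near-infrared form, (Green - NIR) / (Green + NIR). The function names and reflectance values are hypothetical, and the step of selecting the pixels that fall within a given distance of a home is not shown.

```python
def ndwi(green, nir):
    """Water index for one pixel, assuming the green/NIR formulation.

    Returns None when both reflectances are zero (index undefined).
    """
    total = green + nir
    if total == 0:
        return None
    return (green - nir) / total


def mean_ndwi(pixels):
    """Average the index over (green, nir) pairs, skipping undefined pixels.

    `pixels` stands in for the cells surrounding a household; how those
    cells are selected is an assumption left outside this sketch.
    """
    values = [ndwi(g, n) for g, n in pixels]
    valid = [v for v in values if v is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical reflectances: a water-like pixel, a vegetation-like pixel,
# and a no-data pixel.
household_pixels = [(0.3, 0.1), (0.1, 0.3), (0.0, 0.0)]
print(mean_ndwi(household_pixels))
```

Under this formulation, values closer to 1 suggest open water and values closer to -1 suggest vegetation or dry land, so a higher household-level mean corresponds to more surrounding surface water.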

Keywords: China; Geographic information systems; Infectious disease surveillance; Machine learning; Oncomelania hupensis; Prevention and control; Remote sensing technology; Schistosomiasis; Snails.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. Depiction of household inclusion and exclusion

Fig. 2. Maps of the study villages

Fig. 3. Receiver Operating Characteristics (ROC) Area Under the Curve (AUC) for snail and environmental models

Fig. 4. Variable importance plots for the snail and environmental data models. For each of the three models generated with the snail data and the environmental data, variable importance was determined using Mean Decrease in Accuracy (MDA). Each variable is assigned one color across all three models such that color can be used to highlight major shifts in variable importance ranks between models

Fig. 5. Prediction map showing the probability of S. japonicum infection using the top-performing environmental data model. The final top performing model was defined as the one with the highest kappa, accuracy, and receiver operating characteristic (ROC) area under the curve (AUC), respectively. Model performance metrics (Cohen’s kappa and accuracy) highlighted that the open-source environmental data models outperformed the snail data models. The top performing environmental data model was used to create a prediction surface of the probability of S. japonicum infection across the entire study area
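The Mean Decrease in Accuracy measure named in the Fig. 4 caption can be sketched as a permutation test: shuffle one predictor, re-score the model, and record how much accuracy drops. The code below is a simplified, model-agnostic illustration and not the authors' implementation; Random Forest software typically computes this per tree on out-of-bag samples. The toy classifier, the feature layout, and the data values are all hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)


def decrease_in_accuracy(predict, rows, labels, column, seed=0):
    """Accuracy lost when the values in one column are shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)

    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)

    # Rebuild each row with the shuffled value in place of the original.
    permuted = [
        row[:column] + [value] + row[column + 1:]
        for row, value in zip(rows, shuffled)
    ]
    return baseline - accuracy(predict, permuted, labels)


# Hypothetical rows: [water_index, distance_to_road_km].
rows = [[0.4, 1.2], [-0.3, 0.1], [0.2, 2.0], [-0.5, 0.3], [0.1, 0.8], [-0.2, 1.5]]


def toy_model(row):
    """Toy classifier that only looks at the first feature."""
    return row[0] > 0.0


# Labels are generated by the toy model, so baseline accuracy is 1.0.
labels = [toy_model(row) for row in rows]

for name, column in [("water_index", 0), ("distance_to_road_km", 1)]:
    print(name, decrease_in_accuracy(toy_model, rows, labels, column))
```

In this toy setup the second feature is ignored by the classifier, so shuffling it cannot change any prediction and its decrease in accuracy is zero; a feature the model relies on can only lose accuracy when shuffled.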
