PLoS One. 2021 Jun 1;16(6):e0252448.
doi: 10.1371/journal.pone.0252448. eCollection 2021.

Predicting direct and indirect non-target impacts of biocontrol agents using machine-learning approaches

Hannah J Kotula et al. PLoS One. 2021.

Abstract

Biological pest control (i.e. 'biocontrol') agents can have direct and indirect non-target impacts, and predicting these effects (especially indirect impacts) remains a central challenge in biocontrol risk assessment. The analysis of ecological networks offers a promising approach to understanding the community-wide impacts of biocontrol agents (via direct and indirect interactions). Independently, species traits and phylogenies have been shown to successfully predict species interactions and network structure (alleviating the need to collect quantitative interaction data), but whether these approaches can be combined to predict indirect impacts of natural enemies remains untested. Whether predictions of interactions (i.e. direct effects) can be made equally well for generalists vs. specialists, abundant vs. less abundant species, and across different habitat types is also untested for consumer-prey interactions. Here, we used two machine-learning techniques (random forest and k-nearest neighbour; KNN) to test whether we could accurately predict empirically observed quantitative host-parasitoid networks using trait and phylogenetic information. Then, we tested whether the accuracy of machine-learning-predicted interactions depended on the generality or abundance of the interacting partners, or on the source (habitat type) of the training data. Finally, we used these predicted networks to generate predictions of indirect effects via shared natural enemies (i.e. apparent competition), and tested these predictions against empirically observed indirect effects between hosts. We found that random-forest models predicted host-parasitoid pairwise interactions (which could be used to predict attack of non-target host species) more successfully than KNN. This predictive ability depended on the generality of the interacting partners for KNN models, and depended on species' abundances for both random-forest and KNN models, but did not depend on the source (habitat type) of data used to train the models. Further, although our machine-learning-informed methods could significantly predict indirect effects, the explanatory power of our machine-learning models for indirect interactions was fairly low. Combining machine-learning and network approaches provides a starting point for reducing risk in biocontrol introductions, and could be applied more generally to predicting species interactions such as impacts of invasive species.
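
To make the KNN idea in the abstract concrete, the following is a minimal, generic sketch of k-nearest-neighbour regression: the interaction frequency of an unobserved host-parasitoid pair is estimated as the mean frequency of the k most similar observed pairs. It is not the authors' implementation. The feature vectors, the Euclidean distance, the function name `knn_predict`, and the toy data are all assumptions made for illustration.

```python
import math

def knn_predict(train, query, k=3):
    """Predict an interaction frequency for `query` by averaging the
    observed frequencies of its k nearest training pairs.

    train: list of (feature_vector, observed_frequency) tuples, where each
           feature vector is a hypothetical numeric encoding of the traits
           (and phylogenetic position) of one host-parasitoid pair.
    query: feature vector for the pair whose frequency is to be predicted.
    """
    if not train:
        raise ValueError("training data must not be empty")
    # Rank training pairs by Euclidean distance to the query pair.
    ranked = sorted(train, key=lambda row: math.dist(row[0], query))
    nearest = ranked[:k]
    return sum(freq for _, freq in nearest) / len(nearest)

# Hypothetical training data: two pairs that interact, two that do not.
toy_train = [
    ((1.0, 0.2), 4),
    ((1.1, 0.3), 6),
    ((5.0, 0.9), 0),
    ((5.2, 1.0), 0),
]

predicted_similar = knn_predict(toy_train, (1.05, 0.25), k=2)
predicted_dissimilar = knn_predict(toy_train, (5.1, 0.95), k=2)
```

A random forest would replace the neighbour average with an ensemble of decision trees fitted to the same kind of feature table; that is omitted here to keep the sketch dependency-free.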

Conflict of interest statement

The authors have declared that no competing interests exist. We acknowledge that one of the authors (J.H.T) is employed by a commercial company (The New Zealand Institute for Plant and Food Research Limited). This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Summary of methods.
(a) Lepidopteran larvae (‘hosts’) were collected from eight native forest sites and eight plantation forest sites (we call these ‘training sites’ as data from these sites were used to train the models). (b) Hosts were identified and reared to determine parasitism rates and identities. We created three quantitative training meta-networks (‘native’, ‘plantation’, ‘combined’) by pooling interaction data across native sites, plantation sites, and all sites, respectively. (c) We used machine-learning models (random forest and KNN), informed by training meta-networks, traits and phylogenies, to predict interaction frequencies at test sites (16 native forest and 16 plantation forest) at time t. We created a separate model for each of the three training meta-networks, for both machine-learning approaches (i.e., 6 models: random forest/KNN × ‘native’/‘plantation’/‘combined’). (d, e) We then tested whether the ability of these machine-learning models to predict interaction frequencies depended on species generality or abundance, and whether predictive ability depended on the data (i.e. native, plantation, combined) used to inform the model. (f) We then calculated the potential for apparent competition, dij, between each pair of host species in the training data (using data from both native and plantation sites). (g) We calculated expected parasitism rates (E) for host species at the time t+1 test sites using these dij values, along with predicted attack rates (α) calculated from our machine-learning-predicted networks at the time t test sites, and abundance (n) from the test sites. We calculated expected parasitism rates (E) for both random forest and KNN. Additionally, we calculated expected parasitism rates using a data-based approach used by Frost et al. [21]. (h, i) We tested whether expected parasitism rates could significantly predict observed parasitism rates at the time t+1 test sites.
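
Steps (b) and (f) can be sketched in a few lines. The caption does not give the formula for dij; the sketch below assumes the widely used index of Müller et al. (1999), in which dij is the probability that a parasitoid attacking host i developed on host j. The function names, the dictionary layout, and the toy numbers are hypothetical.

```python
from collections import defaultdict

def pool_sites(site_networks):
    """Pool per-site interaction frequencies into one meta-network.

    site_networks: iterable of dicts mapping (host, parasitoid) -> frequency.
    """
    pooled = defaultdict(float)
    for network in site_networks:
        for pair, freq in network.items():
            pooled[pair] += freq
    return dict(pooled)

def apparent_competition(alpha):
    """Compute d_ij for every ordered pair of hosts (assumed Mueller-style index).

    d_ij = sum_k (alpha_ik / sum_l alpha_il) * (alpha_jk / sum_m alpha_mk)
    where alpha_ik is the frequency of the link between host i and parasitoid k.
    """
    host_totals = defaultdict(float)
    parasitoid_totals = defaultdict(float)
    for (host, parasitoid), freq in alpha.items():
        host_totals[host] += freq
        parasitoid_totals[parasitoid] += freq

    d = {}
    for i in host_totals:
        for j in host_totals:
            total = 0.0
            for k in parasitoid_totals:
                a_ik = alpha.get((i, k), 0.0)
                a_jk = alpha.get((j, k), 0.0)
                if a_ik and a_jk:
                    total += (a_ik / host_totals[i]) * (a_jk / parasitoid_totals[k])
            d[(i, j)] = total
    return d

# Hypothetical data from two sites; host A shares one parasitoid with B and one with C.
site_1 = {("A", "p1"): 2, ("A", "p2"): 1, ("B", "p1"): 1}
site_2 = {("A", "p1"): 1, ("C", "p2"): 3}
meta_network = pool_sites([site_1, site_2])
d = apparent_competition(meta_network)
```

Under this formulation each row sums to one (the values of dij over all j, including j = i, add up to 1 for any host i with at least one parasitoid), and the index is asymmetric: in the toy data `d[("A", "B")]` is 0.1875 while `d[("B", "A")]` is 0.75.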
Fig 2. Predictions of observed interaction frequency.
(A) Random-forest-predicted probability significantly predicted observed interaction frequency for all three random-forest models (‘combined’, ‘native’, ‘plantation’), whereas (B) KNN-predicted frequency was not retained in any of the best-fitting models (tested using Poisson linear mixed models; dashed lines represent non-significant relationships). Deviance residuals of the best-fitting model excluding the predictor on the x-axis are plotted, to better reflect the partial effect of the predictor. Each point represents a host-parasitoid species pair present within a site in the first time step (t) test sites.
Fig 3. Overall, machine-learning predictions of observed interaction frequency varied with the generality (measured as normalised degree; ND) of the interacting partners for KNN, but not random-forest models, and depended on the abundance of the interacting partners for both models.
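
As a rough illustration of the generality measure named in this caption, normalised degree is commonly defined as a species' number of interaction partners divided by the number of potential partners in the network. The sketch below follows that common definition; the function name and the toy links are hypothetical, and the authors' exact calculation may differ.

```python
def normalised_degree(links):
    """Return (host_nd, parasitoid_nd) for a binary bipartite network.

    links: iterable of (host, parasitoid) pairs observed to interact.
    A host's ND is its number of parasitoid partners divided by the total
    number of parasitoid species; a parasitoid's ND is defined symmetrically.
    """
    links = set(links)
    hosts = {host for host, _ in links}
    parasitoids = {parasitoid for _, parasitoid in links}

    host_nd = {
        h: sum(1 for host, _ in links if host == h) / len(parasitoids)
        for h in hosts
    }
    parasitoid_nd = {
        p: sum(1 for _, parasitoid in links if parasitoid == p) / len(hosts)
        for p in parasitoids
    }
    return host_nd, parasitoid_nd

# Hypothetical network: host A is attacked by both parasitoids, B and C by one each.
toy_links = [("A", "p1"), ("A", "p2"), ("B", "p1"), ("C", "p2")]
host_nd, parasitoid_nd = normalised_degree(toy_links)
```

In the toy network, host A has ND = 1.0 (a full generalist relative to this network), while hosts B and C each have ND = 0.5.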
Fig 4. Predictions of observed parasitism rate.
(A) Random-forest-expected and (B) KNN-expected parasitism rates both significantly predicted observed parasitism rate. Residuals of the best-fitting model excluding the predictor on the x-axis are plotted. Each point represents a host species within a site in the second time step (t+1) test sites. Three and four outliers, respectively, were removed from (A) and (B), though the results with the outliers included were qualitatively the same in both cases.
