PLoS One. 2021 Jun 1;16(6):e0252448.

doi: 10.1371/journal.pone.0252448. eCollection 2021.

Predicting direct and indirect non-target impacts of biocontrol agents using machine-learning approaches

Hannah J Kotula¹, Guadalupe Peralta¹, Carol M Frost², Jacqui H Todd³, Jason M Tylianakis¹,⁴

Affiliations

¹ Centre for Integrative Ecology, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.
² Department of Renewable Resources, University of Alberta, Edmonton, Canada.
³ The New Zealand Institute for Plant and Food Research Limited, Auckland, New Zealand.
⁴ Bio-Protection Research Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.

PMID: 34061885
PMCID: PMC8168882
DOI: 10.1371/journal.pone.0252448

Hannah J Kotula et al. PLoS One. 2021.

Erratum in

Correction: Predicting direct and indirect non-target impacts of biocontrol agents using machine-learning approaches.
Kotula HJ, Peralta G, Frost CM, Todd JH, Tylianakis JM. PLoS One. 2021 Sep 29;16(9):e0258080. doi: 10.1371/journal.pone.0258080. eCollection 2021. PMID: 34587224. Free PMC article.

Abstract

Biological pest control (i.e. 'biocontrol') agents can have direct and indirect non-target impacts, and predicting these effects (especially indirect impacts) remains a central challenge in biocontrol risk assessment. The analysis of ecological networks offers a promising approach to understanding the community-wide impacts of biocontrol agents (via direct and indirect interactions). Independently, species traits and phylogenies have been shown to successfully predict species interactions and network structure (alleviating the need to collect quantitative interaction data), but whether these approaches can be combined to predict indirect impacts of natural enemies remains untested. Whether predictions of interactions (i.e. direct effects) can be made equally well for generalists vs. specialists, abundant vs. less abundant species, and across different habitat types is also untested for consumer-prey interactions. Here, we used two machine-learning techniques (random forest and k-nearest neighbour; KNN) to test whether we could accurately predict empirically-observed quantitative host-parasitoid networks using trait and phylogenetic information. Then, we tested whether the accuracy of machine-learning-predicted interactions depended on the generality or abundance of the interacting partners, or on the source (habitat type) of the training data. Finally, we used these predicted networks to generate predictions of indirect effects via shared natural enemies (i.e. apparent competition), and tested these predictions against empirically observed indirect effects between hosts. We found that random-forest models predicted host-parasitoid pairwise interactions (which could be used to predict attack of non-target host species) more successfully than KNN. 
This predictive ability depended on the generality of the interacting partners for KNN models, and depended on species' abundances for both random-forest and KNN models, but did not depend on the source (habitat type) of data used to train the models. Further, although our machine-learning informed methods could significantly predict indirect effects, the explanatory power of our machine-learning models for indirect interactions was reasonably low. Combining machine-learning and network approaches provides a starting point for reducing risk in biocontrol introductions, and could be applied more generally to predicting species interactions such as impacts of invasive species.
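To illustrate how a k-nearest-neighbour model can turn trait information into a predicted interaction frequency, here is a minimal sketch. It is not the authors' implementation: the trait vectors, the Euclidean distance, the unweighted mean, and the name `knn_predict` are all assumptions made for illustration, and the abstract does not specify how traits and phylogeny were encoded.

```python
import math

def knn_predict(train, query_traits, k=3):
    """Predict an interaction frequency as the mean of the k nearest
    training pairs in (hypothetical) trait space.

    train: list of (trait_vector, observed_interaction_frequency)
    query_traits: trait vector for the host-parasitoid pair to predict
    """
    # Rank training pairs by Euclidean distance to the query pair.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query_traits))
    nearest = by_distance[:k]
    # Unweighted mean of the neighbours' observed frequencies.
    return sum(freq for _, freq in nearest) / len(nearest)

# Made-up training data: each tuple is (traits of a species pair, frequency).
train = [
    ((1.0, 0.2), 5),
    ((1.1, 0.3), 7),
    ((0.9, 0.1), 6),
    ((4.0, 3.0), 0),
    ((4.2, 3.1), 1),
]

similar_pair = knn_predict(train, (1.0, 0.25), k=3)
distant_pair = knn_predict(train, (4.1, 3.0), k=2)
print(similar_pair, distant_pair)
```

A pair whose traits resemble frequently interacting pairs receives a high predicted frequency, which is the intuition behind using such predictions to flag possible attack on non-target hosts.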

Conflict of interest statement

The authors have declared that no competing interests exist. We acknowledge that one of the authors (J.H.T.) is employed by a commercial company (The New Zealand Institute for Plant and Food Research Limited). This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

**Fig 1. Summary of methods.**
(a) Lepidopteran larvae (‘hosts’) were collected from eight native forest sites and eight plantation forest sites (we call these ‘training sites’ as data from these sites were used to train the models). (b) Hosts were identified and reared to determine parasitism rates and parasitoid identities. We created three quantitative training meta-networks (‘native’, ‘plantation’, ‘combined’) by pooling interaction data across native sites, plantation sites, and all sites, respectively. (c) We used machine-learning models (random forest and KNN), informed by training meta-networks, traits and phylogenies, to predict interaction frequencies at test sites (16 native forest and 16 plantation forest) at time *t*. We created a separate model for each of the three training meta-networks, for both machine-learning approaches (i.e., 6 models: random forest/KNN × ‘native’/‘plantation’/‘combined’). (d, e) We then tested whether the ability of these machine-learning models to predict interaction frequencies depended on species generality or abundance, and whether predictive ability depended on the data (i.e. native, plantation, combined) used to inform the model. (f) We then calculated the potential for apparent competition, d_ij, between each pair of host species in the training data (using data from both native and plantation sites). (g) We calculated expected parasitism rates (E) for host species at test sites at time *t+1* using these d_ij values, along with predicted attack rates (α) calculated from our machine-learning-predicted networks at test sites at time *t*, and abundance (n) from the test sites. We calculated expected parasitism rates (E) for both random forest and KNN. Additionally, we calculated expected parasitism rates using a data-based approach used by Frost *et al*. [21]. (h, i) We tested whether expected parasitism rates could significantly predict observed parasitism rates at test sites at time *t+1*.
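The potential for apparent competition in panel (f) can be sketched in code. The caption does not give the formula, so the sketch below assumes one widely used definition, d_ij = Σ_k (a_ik / Σ_l a_il) × (a_jk / Σ_m a_mk), where a_ik is the interaction frequency between host i and parasitoid k. Whether the study used exactly this form is an assumption, and the function name and example data are hypothetical.

```python
from collections import defaultdict

def apparent_competition(links):
    """Compute d_ij for every ordered pair of hosts.

    links: dict mapping (host, parasitoid) -> interaction frequency.
    Returns a dict mapping (host_i, host_j) -> d_ij, read as the share of
    parasitoids attacking host i that developed on host j (assumed formula).
    """
    host_totals = defaultdict(float)        # sum over parasitoids, per host
    parasitoid_totals = defaultdict(float)  # sum over hosts, per parasitoid
    for (host, parasitoid), freq in links.items():
        host_totals[host] += freq
        parasitoid_totals[parasitoid] += freq

    hosts = sorted(host_totals)
    d = {}
    for i in hosts:
        for j in hosts:
            total = 0.0
            for k, k_total in parasitoid_totals.items():
                a_ik = links.get((i, k), 0.0)
                a_jk = links.get((j, k), 0.0)
                # Share of i's parasitism due to k, times share of k reared from j.
                total += (a_ik / host_totals[i]) * (a_jk / k_total)
            d[(i, j)] = total
    return d

# Made-up quantitative network: hosts A, B, C and parasitoids p1, p2.
example_links = {("A", "p1"): 4, ("A", "p2"): 1, ("B", "p1"): 2, ("C", "p2"): 3}
d = apparent_competition(example_links)
print(round(d[("A", "B")], 3), round(d[("B", "A")], 3))
```

Under this definition the index is asymmetric (d_AB generally differs from d_BA), and for each host i the values d_ij sum to 1 across all hosts j, including j = i.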

**Fig 2. Predictions of observed interaction frequency.**
(A) Random-forest-predicted probability significantly predicted observed interaction frequency for all three random-forest models (‘combined’, ‘native’, ‘plantation’), whereas (B) KNN-predicted frequency was not retained in any of the best-fitting models (tested using Poisson linear mixed models; dashed lines represent non-significant relationships). Deviance residuals of the best-fitting model excluding the predictor on the x-axis are plotted, to better reflect the partial effect of the predictor. Each point represents a host-parasitoid species pair present within a test site in the first time step (*t*).

**Fig 3. Overall, machine-learning predictions of observed interaction frequency varied with the generality (measured as normalised degree; ND) of the interacting partners for KNN, but not random-forest models, and depended on the abundance of the interacting partners for both models.**

**Fig 4. Predictions of observed parasitism rate.**
(A) Random-forest- and (B) KNN-expected parasitism rate both significantly predicted observed parasitism rate. Residuals of the best-fitting model excluding the predictor on the x-axis are plotted. Each point represents a host species within a test site in the second time step (*t+1*). Three and four outliers were removed from (A) and (B), respectively, though the results with the outliers included were qualitatively the same in both cases.

Cited by

Correction: Predicting direct and indirect non-target impacts of biocontrol agents using machine-learning approaches.
Kotula HJ, Peralta G, Frost CM, Todd JH, Tylianakis JM. PLoS One. 2021 Sep 29;16(9):e0258080. doi: 10.1371/journal.pone.0258080. eCollection 2021. PMID: 34587224. Free PMC article.
Groundbreaking Technologies and the Biocontrol of Fungal Vascular Plant Pathogens.
Gómez-Lama Cabanás C, Mercado-Blanco J. J Fungi (Basel). 2025 Jan 18;11(1):77. doi: 10.3390/jof11010077. PMID: 39852495. Free PMC article. Review.

References

1. Pimentel D, Acquay H, Biltonen M, Rice P, Silva M, Nelson J, et al. Environmental and economic costs of pesticide use. Bioscience. 1992;42(10):750–60.
2. Tilman D, Cassman KG, Matson PA, Naylor R, Polasky S. Agricultural sustainability and intensive production practices. Nature. 2002;418:671–7. doi: 10.1038/nature01014
3. Palumbi SR. Humans as the world’s greatest evolutionary force. Science. 2001;293(5536):1786–90. doi: 10.1126/science.293.5536.1786
4. Brattsten LB, Holyoke CW, Leeper JR, Raffa KF. Insecticide resistance: challenge to pest-management and basic research. Science. 1986;231(4743):1255–60. doi: 10.1126/science.231.4743.1255
5. Denholm I, Rowland MW. Tactics for managing pesticide resistance in arthropods: theory and practice. Annual Review of Entomology. 1992;37:91–112. doi: 10.1146/annurev.en.37.010192.000515

Associated data

Dryad/10.5061/dryad.t5557
