PLoS ONE (Jan 2021)
Predicting direct and indirect non-target impacts of biocontrol agents using machine-learning approaches.
Abstract
Biological pest control (i.e. 'biocontrol') agents can have direct and indirect non-target impacts, and predicting these effects (especially indirect impacts) remains a central challenge in biocontrol risk assessment. The analysis of ecological networks offers a promising approach to understanding the community-wide impacts of biocontrol agents (via direct and indirect interactions). Independently, species traits and phylogenies have been shown to successfully predict species interactions and network structure (alleviating the need to collect quantitative interaction data), but whether these approaches can be combined to predict indirect impacts of natural enemies remains untested. Whether predictions of interactions (i.e. direct effects) can be made equally well for generalists vs. specialists, abundant vs. less abundant species, and across different habitat types is also untested for consumer-prey interactions. Here, we used two machine-learning techniques (random forest and k-nearest neighbour; KNN) to test whether we could accurately predict empirically observed quantitative host-parasitoid networks using trait and phylogenetic information. Then, we tested whether the accuracy of machine-learning-predicted interactions depended on the generality or abundance of the interacting partners, or on the source (habitat type) of the training data. Finally, we used these predicted networks to generate predictions of indirect effects via shared natural enemies (i.e. apparent competition), and tested these predictions against empirically observed indirect effects between hosts. We found that random-forest models predicted host-parasitoid pairwise interactions (which could be used to predict attack of non-target host species) more successfully than KNN. This predictive ability depended on the generality of the interacting partners for KNN models, and depended on species' abundances for both random-forest and KNN models, but did not depend on the source (habitat type) of data used to train the models. Further, although our machine-learning-informed methods could significantly predict indirect effects, the explanatory power of our machine-learning models for indirect interactions was reasonably low. Combining machine-learning and network approaches provides a starting point for reducing risk in biocontrol introductions, and could be applied more generally to predicting species interactions such as impacts of invasive species.
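To make the idea of indirect effects via shared natural enemies concrete, the following minimal sketch computes a potential-for-apparent-competition index from a quantitative host-parasitoid network. It assumes the widely used index often attributed to Müller et al. (1999); the abstract does not state which measure the authors used, so treat this as an illustration rather than their method. The function name, the dictionary layout, and the example species and counts are all hypothetical.

```python
from collections import defaultdict


def apparent_competition(network):
    """Potential for apparent competition between every ordered pair of hosts.

    `network` maps host -> {parasitoid: interaction frequency}. This layout is
    an assumption made for the sketch, not a format taken from the paper.

    d[(i, j)] is the sum over parasitoids k of
        (a_ik / total attacks on host i) * (a_jk / total attacks by parasitoid k),
    read as the share of parasitoids attacking host i that developed on host j.
    """
    # Row sums: total parasitism recorded for each host.
    host_totals = {host: sum(links.values()) for host, links in network.items()}

    # Column sums: total attacks recorded for each parasitoid across all hosts.
    parasitoid_totals = defaultdict(float)
    for links in network.values():
        for parasitoid, weight in links.items():
            parasitoid_totals[parasitoid] += weight

    d = {}
    for host_i, links_i in network.items():
        if host_totals[host_i] == 0:
            continue  # an unparasitised host receives no indirect effects
        for host_j, links_j in network.items():
            total = 0.0
            for parasitoid, a_ik in links_i.items():
                a_jk = links_j.get(parasitoid, 0.0)
                if a_jk > 0:
                    total += (a_ik / host_totals[host_i]) * (
                        a_jk / parasitoid_totals[parasitoid]
                    )
            d[(host_i, host_j)] = total
    return d


# Hypothetical toy network: three hosts, two parasitoids, made-up counts.
example_network = {
    "host_A": {"wasp_1": 8, "wasp_2": 2},
    "host_B": {"wasp_1": 2},
    "host_C": {"wasp_2": 6},
}
d_example = apparent_competition(example_network)
```

In this sketch the same function can be applied to an observed network and to a network assembled from model-predicted interactions, after which the two sets of d values can be compared. For any parasitised host i, the values d[(i, j)] summed over all hosts j (including i itself) equal 1, which is a quick sanity check on the input.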