Parasite (Jan 2021)
Constraints of using historical data for modelling the spatial distribution of helminth parasites in ruminants
Abstract
Dicrocoelium dendriticum is a trematode that infects ruminant livestock and requires two different intermediate hosts to complete its lifecycle. Modelling the spatial distribution of this parasite can help to improve its management in higher-risk regions. The aim of this research was to assess the constraints of using historical data sets when modelling the spatial distribution of helminth parasites in ruminants. A parasitological data set provided by CREMOPAR (Napoli, Italy) and covering most of Italy was used. A baseline model (Random Forest, VECMAP®) built on the entire data set was first used to determine the minimum number of data points needed to build a stable model. Annual distribution models were then computed and compared with the baseline model. The best prediction rate and statistical output were obtained for 2012 and the worst for 2016, even though the sample size for 2012 was significantly smaller than that for 2016. We discuss how this may be explained by the fact that the 2012 samples were more evenly distributed geographically, whilst most of the 2016 data were strongly clustered. We conclude that the spatial distribution of the input data appears to be more important than the actual sample size when computing species distribution models. Uneven spatial coverage is often a major issue when using historical data to develop spatial models, because such data sets often include sampling biases and large geographical gaps. If this bias is not corrected, the spatial distribution model outputs may reflect the sampling effort rather than the real species distribution.
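The contrast between sample size and spatial spread can be illustrated with a small sketch. It is not the method used in this study (which relied on Random Forest models in VECMAP®); it is only one possible way to quantify how clustered a set of sampling locations is, here using the Clark-Evans nearest-neighbour ratio. The coordinates, region size, and function name are hypothetical, and no edge correction is applied.

```python
import math
import random


def nearest_neighbour_index(points, area):
    """Clark-Evans ratio: observed mean nearest-neighbour distance divided by
    the distance expected for a random pattern of the same density.
    Values near 1 suggest randomness, below 1 clustering, above 1 even spacing.
    No edge correction is applied (a simplifying assumption)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point (brute force).
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


rng = random.Random(0)
region_area = 100 * 100  # hypothetical 100 x 100 km study region

# Hypothetical survey A: 100 farms spread on a regular grid.
even = [(5 + 10 * i, 5 + 10 * j) for i in range(10) for j in range(10)]

# Hypothetical survey B: 400 farms packed around a single location.
clustered = [(rng.gauss(50, 3), rng.gauss(50, 3)) for _ in range(400)]

r_even = nearest_neighbour_index(even, region_area)
r_clustered = nearest_neighbour_index(clustered, region_area)
print(f"even survey:      n={len(even)}, ratio={r_even:.2f}")
print(f"clustered survey: n={len(clustered)}, ratio={r_clustered:.2f}")
```

In this toy setting the larger survey covers far less of the region than the smaller one, which is the kind of situation described above for 2016 versus 2012. A ratio well below 1 for a given year would be a hint that the data may need spatial thinning or bias correction before modelling.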
Keywords