Methods in Ecology and Evolution (Nov 2023)

USE it: Uniformly sampling pseudo‐absences within the environmental space for applications in habitat suitability models

  • Daniele Da Re,
  • Enrico Tordoni,
  • Jonathan Lenoir,
  • Jonas J. Lembrechts,
  • Sophie O. Vanwambeke,
  • Duccio Rocchini,
  • Manuele Bazzichetto

DOI
https://doi.org/10.1111/2041-210X.14209
Journal volume & issue
Vol. 14, no. 11
pp. 2873 – 2887

Abstract

Habitat suitability models infer the geographical distribution of species using occurrence data and environmental variables. While data on species presence are increasingly accessible, the difficulty of confirming real absences in the field often forces researchers to generate them in silico. To this end, pseudo‐absences are commonly sampled randomly across the study area (i.e. the geographical space). However, this introduces sample location bias (i.e. the sampling is unbalanced towards the most frequent habitats occurring within the geographical space) and favours class overlap (i.e. overlap between environmental conditions associated with species presences and pseudo‐absences) in the training dataset. To mitigate this, we propose an alternative methodology (i.e. the uniform approach) that systematically samples pseudo‐absences within a portion of the environmental space delimited by a kernel‐based filter, which seeks to minimise the number of false absences included in the training dataset. We simulated 50 virtual species and modelled their distribution using training datasets assembled with the presence points of the virtual species and pseudo‐absences collected using either the uniform approach or other approaches that randomly sample pseudo‐absences within the geographical space. We compared the predictive performance of the resulting habitat suitability models and evaluated the extent of sample location bias and class overlap associated with the different sampling strategies. Results indicated that the uniform approach: (i) effectively reduces sample location bias and class overlap; (ii) provides predictive performance comparable to that of sampling strategies carried out in the geographical space; and (iii) yields pseudo‐absences that adequately represent the environmental conditions available across the study area. We developed a set of R functions in an accompanying R package called USE to disseminate the uniform approach.
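The idea behind the uniform approach can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the API of the USE R package: the function names, the two-dimensional environmental space, the regular grid, the Gaussian kernel and the quantile-based threshold are all assumptions made here, not details taken from the paper. Background points whose presence-kernel density is high are filtered out, and the remaining points are sampled evenly across grid cells of the environmental space, so that common habitats do not dominate the pseudo‐absences.

```python
import math
import random


def kernel_density(point, presences, bandwidth):
    """Gaussian kernel density of the presence points at `point` (2-D space)."""
    total = 0.0
    for px, py in presences:
        d2 = (point[0] - px) ** 2 + (point[1] - py) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total / (len(presences) * 2.0 * math.pi * bandwidth ** 2)


def uniform_pseudo_absences(background, presences, n_bins=5, per_cell=2,
                            bandwidth=0.5, quantile=0.25, seed=0):
    """Sample pseudo-absences evenly over a grid in environmental space.

    `background` and `presences` are lists of (x, y) coordinates in a
    two-dimensional environmental space (for example, two PCA axes).
    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)

    # Kernel-based filter (assumed rule): the threshold is a low quantile of
    # the density values observed at the presence points themselves.
    dens = sorted(kernel_density(p, presences, bandwidth) for p in presences)
    threshold = dens[int(quantile * (len(dens) - 1))]
    candidates = [b for b in background
                  if kernel_density(b, presences, bandwidth) < threshold]

    # Regular grid spanning the full extent of the environmental space.
    xs = [b[0] for b in background]
    ys = [b[1] for b in background]
    x0, y0 = min(xs), min(ys)
    width = (max(xs) - x0) or 1.0
    height = (max(ys) - y0) or 1.0

    cells = {}
    for pt in candidates:
        i = min(int((pt[0] - x0) / width * n_bins), n_bins - 1)
        j = min(int((pt[1] - y0) / height * n_bins), n_bins - 1)
        cells.setdefault((i, j), []).append(pt)

    # Draw at most `per_cell` points from every occupied cell.
    sample = []
    for key in sorted(cells):
        pts = cells[key]
        sample.extend(rng.sample(pts, min(per_cell, len(pts))))
    return sample
```

Because each grid cell contributes at most `per_cell` points, frequent environmental conditions cannot swamp rare ones, which is the mechanism by which sampling in the environmental space is meant to reduce sample location bias. The density filter plays the role of limiting class overlap between presences and pseudo‐absences.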

Keywords