Ecological Informatics (Dec 2024)

Leveraging social media and community science data for environmental niche models: A case study with native Australian bees

  • Robert A. Moore,
  • Matthew R.E. Symonds,
  • Scarlett R. Howard

Journal volume & issue
Vol. 84
p. 102857

Abstract

Museum occurrence records are popular sources of information for creating Environmental Niche Models (ENMs), which allow the mapping of the potential niche ranges of species. Occurrence data are often downloaded en masse from established databases. However, non-traditional data sources, such as occurrence records from community/citizen science outreach and social media, are increasing in use and abundance. Data from non-traditional sources are potentially valuable records of information, particularly for species where museum occurrence records may be comparatively scarce. In the current study, we aimed to determine the impact of adding occurrence data from non-traditional databases to ENMs that were originally created using traditional databases, focusing on a group of comparatively understudied species: native Australian bees. We used the Maxent algorithm to model the potential environmental niches of eight species. We created three models for each species: 1) one consisting of only location data from museum specimen collection records from the Atlas of Living Australia (ALA) (a traditional database), 2) one combining ALA and geo-tagged social media (Flickr) data, and 3) one combining ALA and geo-tagged community science data from iNaturalist. This resulted in 24 different models. By comparing the models produced from each of the augmented data sets with the traditional species data set (ALA vs. ALA & Flickr; ALA vs. ALA & iNaturalist), we showed that there were significant differences, not only in predicted ranges, but also in the weighting of environmental variables used by the models to predict the environmental niche. Differences were influenced more by the geographic location of the extra occurrences than by the number of additional occurrence points.
We demonstrate the potential value and risks of including social media and community science geo-tagged image data in supplementing knowledge of species distributions, particularly for relatively under-sampled species such as native bees.
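
The study design described above (eight species, three occurrence data sets each, and two comparisons against the ALA-only baseline) can be sketched as a simple enumeration. This is only an illustrative outline, not the authors' workflow: the species names are hypothetical placeholders (the abstract does not list the eight species), the helper names `model_plan` and `comparisons` are invented for this sketch, and no Maxent fitting is performed.

```python
from itertools import product

# Hypothetical placeholders: the abstract does not name the eight species.
SPECIES = [f"species_{i}" for i in range(1, 9)]

# The three occurrence data sets described in the abstract,
# each mapped to the sources it draws records from.
DATASETS = {
    "ALA": ("ALA",),
    "ALA+Flickr": ("ALA", "Flickr"),
    "ALA+iNaturalist": ("ALA", "iNaturalist"),
}


def model_plan(species, datasets):
    """List one planned model per (species, data set) pair."""
    return [
        {"species": sp, "dataset": name, "sources": sources}
        for sp, (name, sources) in product(species, datasets.items())
    ]


def comparisons(datasets, baseline="ALA"):
    """Pair the baseline data set with each augmented data set."""
    return [(baseline, name) for name in datasets if name != baseline]


plan = model_plan(SPECIES, DATASETS)
print(len(plan))              # 8 species x 3 data sets = 24 models
print(comparisons(DATASETS))  # ALA vs. each augmented data set
```

Each entry in `plan` would correspond to one Maxent run, and each pair from `comparisons` to one per-species contrast of predicted range and variable weighting.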

Keywords