Nature Conservation Research: Заповедная наука (Feb 2024)

Is the GBIF appropriate for use as input in models of predicting species distributions? Study from the Czech Republic

  • Zuzana Štípková,
  • Spyros Tsiftsis,
  • Pavel Kindlmann

DOI
https://doi.org/10.24189/ncr.2024.008
Journal volume & issue
Vol. 9, no. 1
pp. 84 – 95

Abstract

Questions concerning species diversity have attracted ecologists and biogeographers for over a century, mainly because the diversity of life on Earth is in rapid decline, which is expected to continue in the future. One of the most important current databases of species distribution data is the Global Biodiversity Information Facility (GBIF), which contains more than 2 billion occurrence records for all organisms; this number is continuously increasing as new data are added and through integration with other applications. Such data also exist in several national databases, most of which, unfortunately, are often neither freely available nor included in GBIF. We suspected that national databases, which are mostly professionally maintained by governmental organisations, may be more comprehensive than GBIF, which is not centrally organised, and that national databases may therefore give more accurate predictions than GBIF. To test these assumptions, we compared: (i) the amount of data in the Czech database Nálezová databáze ochrany přírody (NDOP, Discovery database of nature protection) with the amount of data in GBIF after restricting it to the Czech Republic, and (ii) the overlap of the predictions of species distributions for the Czech Republic based on these two databases. We used the family Orchidaceae as a model group. We found that: (i) NDOP contains a significantly larger number of records for the studied region (Czech Republic) than GBIF, and (ii) the Maxent predictions based on orchid records in NDOP overlap to a great degree with the predictions based on orchid records in GBIF. Bearing these results in mind, we suggest that if only one database is available for the region studied, that database must be used. If more databases are available for the region studied, the database containing the most locations should be used (usually one of the local ones, like NDOP), because using more locations implies greater significance of the predictions of species distributions.

Keywords