Frontiers in Marine Science (Oct 2021)

Matching Data Types to the Objectives of Species Distribution Modeling: An Evaluation With Marine Fish Species

  • Jing Luan,
  • Chongliang Zhang,
  • Yupeng Ji,
  • Binduo Xu,
  • Ying Xue,
  • Yiping Ren

DOI
https://doi.org/10.3389/fmars.2021.771071
Journal volume & issue
Vol. 8

Abstract

Species distribution models (SDMs) are crucial tools for forecasting species ranges and for reflecting habitat preferences and quality. Different types of species distribution data are commonly used in SDMs, depending on purpose and availability, but the influence of data type on model performance is not well understood. This study considered three data types that differ in the level of organism information they carry and in their acquisition cost: presence/absence (P/A) data, ordinal data, and abundance data. We developed a range of distribution models for nine demersal species in the coastal waters of Shandong Peninsula, China, using two modeling algorithms: the Generalized Additive Model (GAM) and Random Forest. We first evaluated how well all models predicted species occurrence (i.e., habitat suitability or range boundaries), and then compared the models built with ordinal data and abundance data on their ordinal predictions (i.e., relative density or habitat quality). Predictive ability was assessed through cross-validation with several performance measures. Overall, no data type was superior in all situations; across the two algorithms, however, abundance data slightly outperformed ordinal data, and P/A data unexpectedly performed reliably. Specifically, the effectiveness of each data type for the two application purposes of SDMs varied substantially with the modeling algorithm: GAMs always benefited most from ordinal data, whereas the opposite was true for Random Forest. For some small resident organisms with moderate prevalence, coarse distribution data might be sufficient to provide reliable projections. Our findings highlight the importance of clarifying the objectives of SDMs when choosing data types for species distribution modeling.
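
To make the three data types concrete, the sketch below shows one way the coarser types could be derived from abundance records. It is an illustration only, not the authors' procedure: the function names, the example catches, and the class cut points are all hypothetical, and the abstract does not state how the ordinal classes were defined.

```python
from bisect import bisect_right


def to_presence_absence(abundance):
    """Collapse abundance to presence/absence: 1 if any individuals were caught, else 0."""
    return [1 if a > 0 else 0 for a in abundance]


def to_ordinal(abundance, cut_points):
    """Collapse abundance to ordered classes.

    Class 0 means absent. Positive catches fall into classes
    1..len(cut_points) + 1; each cut point is the inclusive lower
    bound of the next class. cut_points must be sorted ascending.
    """
    return [0 if a <= 0 else 1 + bisect_right(cut_points, a) for a in abundance]


def prevalence(abundance):
    """Fraction of sampled sites at which the species was present."""
    pa = to_presence_absence(abundance)
    return sum(pa) / len(pa)


# Hypothetical catches per haul at eight sites (not data from the study).
catches = [0, 3, 0, 12, 47, 1, 0, 150]
# Hypothetical cut points separating low / medium / high density.
cuts = [10, 100]

pa = to_presence_absence(catches)
ordinal = to_ordinal(catches, cuts)
```

Each step discards information: abundance keeps the counts, ordinal data keeps only the rank of the density class, and P/A keeps only occurrence. This is why the cheaper data types are attractive when the modeling objective is only to delineate range boundaries.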

Keywords