PLoS ONE (Jan 2022)
Automatic variable selection in ecological niche modeling: A case study using Cassin’s Sparrow (Peucaea cassinii)
Abstract
MERRA/Max provides a feature selection approach to dimensionality reduction that enables direct use of global climate model outputs in ecological niche modeling. The system accomplishes this reduction through a Monte Carlo optimization in which many independent MaxEnt runs, each operating on a species occurrence file and a small set of variables drawn at random from a large collection, converge on an estimate of the top contributing predictors in that collection. These top predictors can be viewed as candidates for the variable selection step of the ecological niche modeling process. MERRA/Max’s Monte Carlo algorithm operates on files stored in the underlying filesystem, making it scalable to large data sets. Its software components can run as parallel processes in a high-performance cloud computing environment to yield near real-time performance. In tests using Cassin’s Sparrow (Peucaea cassinii) as the target species, MERRA/Max selected, from WorldClim’s Bioclim collection of 19 environmental variables, a set of predictors that have been shown to be important determinants of the species’ bioclimatic niche. It also selected biologically and ecologically plausible predictors from a more diverse set of 86 environmental variables derived from NASA’s Modern-Era Retrospective Analysis for Research and Applications Version 2 (MERRA-2) reanalysis, an output product of the Goddard Earth Observing System Version 5 (GEOS-5) modeling system. We believe these results point to a technological approach that could expand the use of global climate model outputs in ecological niche modeling, foster exploratory experimentation with otherwise difficult-to-use climate data sets, streamline the modeling process, and, eventually, enable automated bioclimatic modeling as a practical, readily accessible, low-cost, commercial cloud service.
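The Monte Carlo selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the MERRA/Max implementation: the function names, the parameters (`subset_size`, `n_runs`, `top_n`), and the ranking by mean contribution are assumptions made for illustration, and `toy_model` is a hypothetical stand-in for a MaxEnt run that simply returns made-up percent contributions for the variables it is given.

```python
import random
from collections import defaultdict


def monte_carlo_select(variables, run_model, subset_size=3, n_runs=500,
                       top_n=5, seed=0):
    """Rank variables by their mean contribution over many small random runs.

    run_model is any callable that takes a list of variable names and
    returns a dict mapping each name to a contribution score. In a real
    workflow this would wrap a MaxEnt run; here it is an assumption.
    """
    rng = random.Random(seed)
    total = defaultdict(float)  # summed contribution per variable
    count = defaultdict(int)    # number of runs each variable took part in

    for _ in range(n_runs):
        # Draw a small random subset from the large collection.
        subset = rng.sample(variables, subset_size)
        for var, contribution in run_model(subset).items():
            total[var] += contribution
            count[var] += 1

    # Average over the runs in which each variable was actually sampled.
    mean = {var: total[var] / count[var] for var in count}
    return sorted(mean, key=mean.get, reverse=True)[:top_n]


def toy_model(subset):
    """Hypothetical stand-in for MaxEnt: two variables are 'truly' important.

    Returns contributions normalized to sum to 100 within the subset,
    loosely mimicking a percent-contribution table.
    """
    hidden_strength = {"bio_1": 5.0, "bio_12": 3.0}  # invented for the demo
    raw = {var: hidden_strength.get(var, 0.5) for var in subset}
    scale = 100.0 / sum(raw.values())
    return {var: value * scale for var, value in raw.items()}
```

Calling `monte_carlo_select([f"bio_{i}" for i in range(1, 20)], toy_model, top_n=2)` ranks the two variables that the toy model treats as important ahead of the other 17. Because each run touches only a few variables and is independent of the others, the runs could in principle be distributed across parallel workers, which is the property the abstract relies on for scalability.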