Ecological Indicators (Apr 2022)
Machine learning-based prediction for grassland degradation using geographic, meteorological, plant and microbial data
Abstract
Extensive grassland degradation under climate change and intensified human activities has threatened ecological security and caused a variety of environmental problems. However, predicting grassland degradation status on a large scale remains challenging because degradation is a multi-factorial phenomenon involving complex changes in ecosystem structure and function, which are hard to characterize fully with mechanistic models. Machine learning algorithms offer a way to model complex systems and mine information from multi-source data without elucidating the underlying mechanisms. Here, we used random forest and neural network algorithms to predict grassland degradation, represented by the rate of change of net primary productivity (NPP), from multi-source data including geographic, meteorological, plant trait, land use type and microbial variables in the grasslands of northern China. In particular, we examined the role of microbes in determining degradation status. The random forest model, rather than the neural network model, achieved high prediction accuracy, with a mean relative error of 16.9% and a mean square error of 9.273e-05. In addition to longitude, aridity index and current NPP state, specific soil microbial groups, mainly Solirubrobacter, were identified as credible biomarkers. In model fitting, geographic, meteorological and plant variables explained 61.8% of the total variance, which rose to 72.8% when the microbial markers were added. These findings provide a theoretical basis for establishing an early-warning system for grassland management and policy-making.
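The three figures of merit quoted in the abstract (mean relative error, mean square error and share of variance explained) can be illustrated with a minimal sketch. The definitions below are the common textbook ones and are an assumption here, since the abstract does not state the exact formulas used. The function names and the sample values are hypothetical and are not data from the study.

```python
from statistics import fmean


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed| (assumed definition)."""
    return fmean(abs(p - o) / abs(o) for o, p in zip(observed, predicted))


def mean_square_error(observed, predicted):
    """Mean of the squared residuals."""
    return fmean((p - o) ** 2 for o, p in zip(observed, predicted))


def explained_variance(observed, predicted):
    """Share of total variance explained: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical NPP changing rates, made up purely for illustration.
observed = [0.010, 0.020, 0.030, 0.040]
predicted = [0.012, 0.018, 0.033, 0.036]

mre = mean_relative_error(observed, predicted)
mse = mean_square_error(observed, predicted)
r2 = explained_variance(observed, predicted)
print(f"MRE = {mre:.1%}, MSE = {mse:.3e}, variance explained = {r2:.1%}")
```

Under these assumed definitions, comparing the explained variance of a model fitted without the microbial variables against one fitted with them would give the kind of increment reported above (61.8% to 72.8%).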