Scientific Reports (Jun 2024)

Global soil respiration estimation based on ecological big data and machine learning model

  • Jiangnan Liu,
  • Junguo Hu,
  • Haoqi Liu,
  • Kanglai Han

DOI
https://doi.org/10.1038/s41598-024-64235-w
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 17

Abstract

Soil respiration (Rs) represents the greatest carbon dioxide flux from terrestrial ecosystems to the atmosphere. However, its environmental drivers are not fully understood, and soil respiration model estimates still carry significant uncertainties. This study aimed to estimate the spatial distribution pattern of global soil respiration and to identify its driving mechanisms by constructing a machine learning model based on ecological big data. First, we constructed an ecological big data set containing 27 environmental factors across five categories. We then used four typical machine learning methods to compare model performance under four training strategies and explored the relationship between soil respiration and environmental factors. Finally, we used the random forest (RF) algorithm, driven by multiple dimensions of environmental factors, to estimate the global spatial distribution pattern of Rs in 2021 and derived the annual soil respiration values. The results showed that RF performed better than the other methods under the four training strategies, with a coefficient of determination R² = 0.78216, root mean squared error (RMSE) = 285.8964 g C m−2 y−1, and mean absolute error (MAE) = 180.4186 g C m−2 y−1, making it more suitable for estimating soil respiration at large scales. In terms of the importance of environmental factors, unlike previous studies, we found that the influence of geographical location was greater than that of mean annual precipitation (MAP). Another new finding was that the enhanced vegetation index 2 (EVI2) contributed more to soil respiration estimates than the enhanced vegetation index (EVI) and the normalized difference vegetation index (NDVI). Our results confirm the potential of ecological big data for estimating Rs over large spatial scales. Ecological big data and machine learning algorithms can be used to improve both the estimation of Rs spatial distribution patterns and the analysis of its drivers.
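
The three evaluation metrics reported in the abstract (R², RMSE, and MAE) can be illustrated with a minimal Python sketch. The function names and the toy observed and predicted values below are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math

def mae(observed, predicted):
    """Mean absolute error: average of |observed - predicted|."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical annual Rs values in g C m-2 y-1 (illustrative only, not study data).
observed = [400.0, 600.0, 800.0, 1000.0]
predicted = [450.0, 550.0, 850.0, 950.0]

print("MAE: ", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("R2:  ", r_squared(observed, predicted))
```

RMSE and MAE share the units of the target variable (here g C m−2 y−1), whereas R² is dimensionless, which is why the abstract reports units only for the first two.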

Keywords