Improving the Spatial Prediction of Soil Organic Carbon Content in Two Contrasting Climatic Regions by Stacking Machine Learning Models and Rescanning Covariate Space

Ruhollah Taghizadeh-Mehrjardi; Karsten Schmidt; Alireza Amirian-Chakan; Tobias Rentschler; Mojtaba Zeraatpisheh; Fereydoon Sarmadian; Roozbeh Valavi; Naser Davatgar; Thorsten Behrens; Thomas Scholten

doi:10.3390/rs12071095

Remote Sensing (Mar 2020)

Affiliations

Ruhollah Taghizadeh-Mehrjardi: Department of Geosciences, Soil Science and Geomorphology, University of Tübingen, 72070 Tübingen, Germany
Karsten Schmidt: eScience Center, University of Tübingen, 72070 Tübingen, Germany
Alireza Amirian-Chakan: Department of Soil Science, Lorestan University, Khorramabad 6815144316, Iran
Tobias Rentschler: Department of Geosciences, Soil Science and Geomorphology, University of Tübingen, 72070 Tübingen, Germany
Mojtaba Zeraatpisheh: Key Laboratory of Geospatial Technology for the Middle and Lower Yellow River Regions, College of Environment and Planning, Henan University, Kaifeng 475004, China
Fereydoon Sarmadian: Department of Soil Science, College of Agriculture, University of Tehran, Karaj 77871-31587, Iran
Roozbeh Valavi: The Quantitative & Applied Ecology Group, School of BioSciences, The University of Melbourne, Victoria 3010, Australia
Naser Davatgar: Soil & Water Research Institute, Agricultural Research, Education and Extension Organization, Karaj 3177993545, Iran
Thorsten Behrens: Department of Geosciences, Soil Science and Geomorphology, University of Tübingen, 72070 Tübingen, Germany
Thomas Scholten: Department of Geosciences, Soil Science and Geomorphology, University of Tübingen, 72070 Tübingen, Germany

DOI: https://doi.org/10.3390/rs12071095
Journal volume & issue: Vol. 12, no. 7, p. 1095

Abstract

Understanding the spatial distribution of soil organic carbon (SOC) content over different climatic regions will enhance our knowledge of carbon gains and losses due to climatic change. However, little is known about the SOC content in the contrasting arid and sub-humid regions of Iran, whose complex SOC–landscape relationships pose a challenge to spatial analysis. Machine learning (ML) models within a digital soil mapping framework can capture such complex relationships. Current research focusses on ensemble ML models to increase the accuracy of prediction. The usual ensemble methods are boosting or weighted averaging. This study proposes a novel ensemble technique: the stacking of multiple ML models through a meta-learning model. In addition, we tested the ensemble through rescanning the covariate space to maximize the prediction accuracy. We first applied six state-of-the-art ML models (i.e., Cubist, random forests (RF), extreme gradient boosting (XGBoost), classical artificial neural network models (ANN), neural network ensemble based on model averaging (AvNNet), and deep learning neural networks (DNN)) to predict and map the spatial distribution of SOC content at six soil depth intervals for both regions. We then stacked these models through a meta-learning model, with and without rescanning the covariate space, to maximize the prediction accuracy. Of the six ML models, the DNN achieved the best modeling accuracy, followed by RF, XGBoost, AvNNet, ANN, and Cubist. Importantly, stacking the models yielded a significant improvement in the prediction of SOC content, especially when combined with rescanning the covariate space. For instance, for the upper 0–5 cm of the soil profiles, the RMSE values of the proposed stacking approaches at the arid site and the sub-humid site were 17% and 9% lower, respectively, than those obtained by the DNN models, the best individual models. This indicates that rescanning the original covariate space by a meta-learning model can extract more information and improve the SOC content prediction accuracy. Overall, our results suggest that the stacking of diverse sets of models could be used to more accurately estimate the spatial distribution of SOC content in different climatic regions.
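
The idea of stacking with and without rescanning the covariate space can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the SOC observations and covariates, substitutes three generic base learners for the six models named above, and uses a ridge regression as a hypothetical meta-learner. In this sketch, "rescanning" is approximated by scikit-learn's `passthrough=True` option, which gives the meta-learner the original covariates alongside the base-model predictions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

N_COVARIATES = 8  # hypothetical number of environmental covariates

# Synthetic stand-in for (covariates, SOC content); not data from the study.
X, y = make_regression(
    n_samples=300, n_features=N_COVARIATES, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def make_base_learners():
    """Illustrative base learners; stand-ins for the six models in the study."""
    return [
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("gbm", GradientBoostingRegressor(random_state=0)),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
    ]


def fit_stack(rescan):
    """Fit a stacked ensemble.

    With rescan=True the meta-learner is trained on the cross-validated
    base-model predictions plus the original covariates; with rescan=False
    it is trained on the base-model predictions only.
    """
    stack = StackingRegressor(
        estimators=make_base_learners(),
        final_estimator=Ridge(alpha=1.0),  # assumed meta-learner
        cv=5,  # out-of-fold predictions are used to train the meta-learner
        passthrough=rescan,
    )
    stack.fit(X_train, y_train)
    return stack


def rmse(model):
    """Root mean squared error on the held-out test split."""
    return float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))


stack_plain = fit_stack(rescan=False)
stack_rescan = fit_stack(rescan=True)
rmse_plain = rmse(stack_plain)
rmse_rescan = rmse(stack_rescan)

print(f"RMSE, stacking without rescanning: {rmse_plain:.2f}")
print(f"RMSE, stacking with rescanning:    {rmse_rescan:.2f}")
```

The two RMSE values printed here describe only the synthetic data and say nothing about the results reported above. The sketch shows only the structural difference between the two stacking variants, namely which inputs the meta-learner receives.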

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
