Geoderma (Oct 2024)
Enhanced VNIR and MIR proximal sensing of soil organic matter and PLFA-derived soil microbial properties through machine learning ensembles and external parameter orthogonalization
Abstract
Portable visible-to-near-infrared (VNIR) and mid-infrared (MIR) spectroscopy coupled with machine learning can provide detailed and inexpensive information on various key soil properties. However, on-site VNIR and MIR proximal sensing applications are hampered by soil moisture and particle size variations, which distort reflectance spectra collected on field-condition soils and impede the integration of established MIR and VNIR soil spectral libraries in predictive models for field measurements.
In this study, we explored the capacity of various machine-learning approaches to calibrate VNIR-MIR models for the prediction of soil organic carbon and phospholipid fatty acid (PLFA)-derived microbial soil properties with field-condition spectral data. We further evaluated the potential to integrate soil spectral libraries into VNIR-MIR proximal sensing applications by testing the transfer of VNIR-MIR models calibrated on pre-treated soil samples to field-condition VNIR-MIR scans using the External Parameter Orthogonalization (EPO) approach to minimize soil moisture and particle size effects.
We compiled a diverse soil dataset encompassing a wide range of organic matter content, soil texture, and parent material from soils under grassland and arable land use (n = 175). VNIR-MIR models were used to predict soil organic carbon (SOC), bacterial biomass (BAC), fungal biomass (FUN), and different soil quality indicators (C:N, fungal-to-bacterial ratio, gram-positive-to-gram-negative ratio) for both field-condition and pre-treated soil spectral data. Calibrations were developed with Partial Least Squares Regression (PLSR), Random Forest (RF), Elastic Net (ENET), Cubist, Support Vector Machines (SVM), and an Ensemble-GLM. We further tested the effectiveness of coupling each machine-learning model with the EPO algorithm to transfer models calibrated on pre-treated soils to field-condition scans.
Our results show that machine learning methods such as Cubist and SVM readily outperformed the standard PLSR calibration, with average improvements of ΔRMSE ∼15 % for pre-treated soils and ΔRMSE ∼10 % for field-condition samples. Ensemble-GLM models were about as accurate as the best individual model in each case but did not yield further improvements. The direct calibration transfer from laboratory calibrations to field-condition spectra exhibited very low accuracy. The EPO approach improved model transfer results significantly (ΔRMSE ∼40 %) but was still less accurate than predictive models using spectra from pre-treated soils (ΔRMSE ∼18 %).
Our findings highlight the benefits of employing a diverse set of machine-learning algorithms and model ensembles for improved VNIR-MIR calibrations of soil properties and demonstrate that the EPO transform is effective in removing moisture and particle size effects from VNIR and MIR soil spectra collected under field conditions. This opens the opportunity to integrate archived local soil data or extensive soil spectral libraries into proximal soil sensing applications with portable VNIR and MIR spectrometers to facilitate the acquisition of high-quality soil information at high spatiotemporal resolution directly in the field.
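To illustrate the EPO transform mentioned above, the following is a minimal sketch of the commonly described formulation: paired spectra of the same samples in pre-treated and field condition give a difference matrix, its dominant directions are taken to represent the external effect (moisture, particle size), and a projection matrix removes those directions from any spectrum. The abstract does not state how the difference matrix was built or how many components were removed, so the function name `epo_projection`, the parameter `n_components`, and the synthetic data are assumptions made purely for illustration, not the authors' implementation.

```python
import numpy as np

def epo_projection(lab_spectra, field_spectra, n_components=2):
    """Build an EPO projection matrix from paired spectra (hypothetical helper).

    lab_spectra, field_spectra: arrays of shape (n_samples, n_bands) holding
    the same samples scanned pre-treated and in field condition.
    """
    # Difference spectra capture the effect of the external parameters.
    D = field_spectra - lab_spectra
    # Right singular vectors of D are the dominant directions of that effect.
    _, _, vt = np.linalg.svd(D, full_matrices=False)
    V = vt[:n_components].T            # shape (n_bands, n_components)
    # Project onto the subspace orthogonal to those directions.
    return np.eye(D.shape[1]) - V @ V.T

# Synthetic example (assumed data): one additive "moisture" direction.
rng = np.random.default_rng(0)
n_samples, n_bands = 30, 50
lab = rng.normal(size=(n_samples, n_bands))
moisture_direction = np.linspace(0.0, 1.0, n_bands)
moisture_level = rng.uniform(0.5, 2.0, size=(n_samples, 1))
field = lab + moisture_level * moisture_direction

P = epo_projection(lab, field, n_components=1)
lab_epo = lab @ P      # calibrate any regression model on these spectra
field_epo = field @ P  # and predict on these transformed field scans
```

In this idealized case the external effect lies along a single direction, so one component removes it entirely and the transformed lab and field spectra coincide. With real spectra the effect is only approximately low-rank, and the number of components would typically be chosen by validation.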