Agronomy (Nov 2023)
Integration of Vis–NIR Spectroscopy and Machine Learning Techniques to Predict Eight Soil Parameters in Alpine Regions
Abstract
Visible and near-infrared spectroscopy (Vis–NIR, 350–1100 nm) has great potential for predicting soil properties. However, hyperspectral prediction of soil parameters in the agricultural areas of alpine regions has received little study, the range of parameters covered is narrow, and the optimal spectral treatments and predictive models for different parameters have not been sufficiently investigated. Therefore, we evaluated the accuracy of predicting total nitrogen (TN), total phosphorus pentoxide (TP2O5), total potassium oxide (TK2O), alkali-hydrolyzable nitrogen (AHN), available phosphorus (AP), available potassium (AK), soil organic matter (SOM), and pH on the Qinghai–Tibet Plateau using Vis–NIR spectroscopy in combination with spectral transformations, correlation analysis, feature selection, and machine learning. The results show that spectral transformations improve the correlation between spectra and parameters, but the improvement depends on the parameter type and the transformation used. Continuum removal (CR), logarithmic first-order differential (FDL), and inverse first-order differential (FDR) had the most significant effects. The feature bands were extracted using the successive projections algorithm (SPA), and models were built using partial least squares regression (PLSR), random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost), and backpropagation neural networks (BPNNs). The accuracy was evaluated based on R², RMSE, RPD, and RPIQ. We found that the PLSR model could predict only SOM and pH, and with lower accuracy than the other models. XGBoost can predict all of the parameters, but only for AHN does it outperform the other methods (R² = 0.776, RMSE = 0.043 g/kg, and RPIQ = 2.88). The RF, SVM, and BPNN models cannot predict AK, AP, and AHN, respectively.
In addition, TP2O5, AP, and pH are best suited for modeling using RF (RPIQ = 2.776, 3.011, and 3.198); TN, AK, and SOM are best suited for modeling using BPNN (RPIQ = 2.851, 2.394, and 3.085); and AHN and TK2O are best suited for XGBoost and SVM, respectively (RPIQ = 2.880 and 3.217). Therefore, this study can provide technical and data support for the accurate and efficient acquisition of soil parameters in alpine agriculture.
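The four accuracy measures named above can be illustrated with a minimal sketch. It assumes the definitions commonly used in soil spectroscopy: RPD as the sample standard deviation of the observed values divided by RMSE, and RPIQ as their interquartile range divided by RMSE. The abstract does not state which variants the study used (for example, sample versus population standard deviation, or the quartile convention), so these choices, the function name `regression_metrics`, and the sample values are illustrative assumptions only.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, RPD, and RPIQ for observed vs. predicted values.

    Assumed definitions (not confirmed by the abstract):
      RPD  = sample standard deviation of observations / RMSE
      RPIQ = (Q3 - Q1) of observations / RMSE
    """
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    mean_obs = statistics.fmean(y_true)
    ss_tot = sum((t - mean_obs) ** 2 for t in y_true)  # total sum of squares

    rmse = math.sqrt(ss_res / len(y_true))
    r2 = 1.0 - ss_res / ss_tot

    sd = statistics.stdev(y_true)                   # sample SD (n - 1)
    # Quartiles with linear interpolation over the inclusive range.
    q1, _, q3 = statistics.quantiles(y_true, n=4, method="inclusive")

    return {"R2": r2, "RMSE": rmse, "RPD": sd / rmse, "RPIQ": (q3 - q1) / rmse}


# Hypothetical observed and predicted values, for illustration only.
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.5, 1.5, 3.0, 4.5, 4.5]
metrics = regression_metrics(y_obs, y_hat)
```

Because RPIQ scales the error by the interquartile range rather than the standard deviation, it is less sensitive to skewed distributions of soil properties, which is one reason it is often reported alongside RPD.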
Keywords