Meitan xuebao (Jul 2023)
Inversion of coal-derived carbon content in mine soils based on hyperspectral index and machine learning
Abstract
Coal particles, a typical form of organic matter with high carbon content, have been widely dispersed in the soil environment by centuries of massive coal consumption. Even small amounts of coal particles in soils can cause a marked overestimation of soil organic carbon (SOC) and thus increase the uncertainty of soil C sequestration assessments. However, methods for distinguishing coal-derived C from SOC are lacking. This study was conducted in the Jiaozuo mining area, which has a history of anthracite mining of more than 100 years and where coal-contaminated soils are widespread. Coal particles and coal-free soils were collected and manually mixed in known quantities, and the hyperspectral characteristics of the resulting mine soil samples were analyzed. Based on eight mathematical transformations of the spectra and two spectral feature screening methods (i.e., the traditional correlation coefficient method and competitive adaptive reweighted sampling (CARS)), six inversion models were established to invert coal-derived C content: three spectral feature index models (i.e., deviation of arch (IDOA), difference index (ID), and ratio index (IR)) and three machine learning models (i.e., partial least squares regression (PLSR), support vector machine (SVM), and random forest (RF)). The applicability of the optimal inversion model was also examined. In the wavelength range from 350 nm to 2500 nm, the spectral curves of coal particles differed clearly from those of plant-derived organic matter and coal-free soils. Moreover, the spectral reflectance (R) of coal-contaminated soils decreased with increasing coal-derived C content. These findings provide a theoretical basis for applying hyperspectral remote sensing technology to the quantitative inversion of coal-derived C in mine soils.
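As a rough illustration of how a difference or ratio index can be built on transformed spectra, the following minimal Python sketch applies the 1/R transformation and searches for the band pair whose index correlates most strongly with carbon content. It is a sketch under stated assumptions, not the authors' procedure: the function names, the exhaustive band-pair search, and the toy data are all hypothetical, and the deviation of arch index is omitted because its formula is not given here.

```python
from itertools import combinations
from math import sqrt


def reciprocal(spectrum):
    """Apply the 1/R transformation to a reflectance spectrum."""
    return [1.0 / r for r in spectrum]


def difference_index(spectrum, i, j):
    """Difference of the spectral values at band positions i and j."""
    return spectrum[i] - spectrum[j]


def ratio_index(spectrum, i, j):
    """Ratio of the spectral values at band positions i and j."""
    return spectrum[i] / spectrum[j]


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_band_pair(spectra, carbon, index_fn):
    """Return (|r|, i, j) for the band pair whose index best tracks carbon."""
    n_bands = len(spectra[0])
    best = (0.0, None, None)
    for i, j in combinations(range(n_bands), 2):
        values = [index_fn(s, i, j) for s in spectra]
        r = abs(pearson(values, carbon))
        if r > best[0]:
            best = (r, i, j)
    return best


# Hypothetical toy data: 4 samples x 3 bands; reflectance falls as carbon rises.
toy_reflectance = [
    [0.40, 0.50, 0.60],
    [0.30, 0.42, 0.55],
    [0.22, 0.35, 0.50],
    [0.15, 0.30, 0.47],
]
toy_carbon = [1.0, 3.0, 5.0, 7.0]  # hypothetical coal-derived C content, %

toy_transformed = [reciprocal(s) for s in toy_reflectance]
score, band_i, band_j = best_band_pair(toy_transformed, toy_carbon, difference_index)
```

Passing `ratio_index` instead of `difference_index` to `best_band_pair` performs the same search for the ratio index. With real spectra of about 2000 bands, an exhaustive pair search is costly, which is one reason a feature screening step is useful.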
The feature wavebands of coal-derived C extracted by CARS were distributed evenly across the wavelength range from 350 nm to 2500 nm, and CARS extracted far more feature wavebands than the traditional correlation coefficient method. After mathematical transformation of the R data, the estimation accuracy of coal-derived C content by the IDOA, ID and IR inversion models was significantly enhanced; among these, the ID model based on the 1/R spectral transformation exhibited the highest estimation accuracy. Compared with the traditional index models, the three machine learning models combined with CARS further enhanced the estimation accuracy of coal-derived C content. Among the three machine learning models, the 1/R-CARS-RF inversion model produced the highest estimation accuracy, with $R_{\rm{m}}^2$ = 0.998, RMSE = 0.348, and RPD = 29.943 for its validation set. The applicability test showed that the 1/R-CARS-RF model applied well to different coal-contaminated soils in the Jiaozuo mining area, with an RMSE of 1.88% and an RPD of 4.97. Hyperspectral remote sensing technology therefore shows promise for determining coal-derived C content in mine soils. In addition, this study provides methodological support for distinguishing coal-derived C from SOC in mine soils and for accurately assessing soil carbon sequestration.
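For readers unfamiliar with the accuracy metrics quoted above, the sketch below shows how R², RMSE and RPD are commonly computed from observed and predicted values. It assumes the usual definitions (R² as the coefficient of determination, RPD as the sample standard deviation of the observed values divided by the RMSE); the abstract does not state which variants the authors used, and the sample values are hypothetical.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observed values / RMSE.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    sd = sqrt(sum((o - mean_o) ** 2 for o in observed) / (n - 1))
    return sd / rmse(observed, predicted)


# Hypothetical observed and predicted coal-derived C contents (%).
toy_observed = [1.0, 3.0, 5.0, 7.0]
toy_predicted = [1.5, 2.5, 5.5, 6.5]

toy_rmse = rmse(toy_observed, toy_predicted)      # 0.5
toy_r2 = r_squared(toy_observed, toy_predicted)   # 0.95
toy_rpd = rpd(toy_observed, toy_predicted)        # about 5.16
```

Because RPD scales the prediction error by the spread of the observed data, a higher RPD indicates that the model error is small relative to the natural variation of the samples.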
Keywords