IEEE Access (Jan 2018)
Variable Selection and Optimization in Rapid Detection of Soybean Straw Biomass Based on CARS
Abstract
Data are the basis of analysis and modeling, and extracting useful data or characteristic variables from a large data set is essential for performance optimization. This paper studies variable selection and trusted computing in the rapid detection of soybean straw biomass. Competitive adaptive reweighted sampling (CARS) is a variable selection method that simulates the "survival of the fittest" principle of Darwin's theory of evolution. CARS is applied to near-infrared spectroscopy (NIRS) data of soybean straw and compared with four other variable selection methods: interval partial least squares (iPLS), synergy interval partial least squares (siPLS), backward interval partial least squares (biPLS), and the successive projection algorithm (SPA). A PLS model is built on the variables selected by each of the five methods, and the models are compared by their root-mean-square error of prediction (RMSEP) and by the difference between the root-mean-square error of cross-validation (RMSECV) and RMSEP. The experimental results show that iPLS and SPA are not suitable for analyzing soybean straw NIRS data, and that CARS extracts characteristic variables more effectively than siPLS and biPLS. Variable selection improves the performance of the prediction model. For hemicellulose, compared with the global PLS model, the RMSECV is reduced from 0.8082 to 0.6572, the RMSEP is reduced from 0.6250 to 0.5710, and the correlation coefficient of the calibration model is increased from 0.7087 to 0.8162. The difference between RMSECV and RMSEP is small, which indicates that the model is stable. Similar results are obtained for cellulose and lignin. It is concluded that CARS is effective for variable selection and optimization of soybean straw NIRS data.
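The comparison criterion described above (RMSEP, and the gap between RMSECV and RMSEP) can be illustrated with a minimal sketch. It is not the paper's implementation: the data are synthetic, the single-response PLS routine is a generic NIPALS-style PLS1 written for this example, and the helper names (`pls1_fit`, `pls1_predict`, `rmsecv`), the fold count, the number of components, and the "selected" column subset are all assumptions. The subset is a hypothetical stand-in for the output of a variable selection method such as CARS, which is not implemented here.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a generic single-response PLS model; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean          # centered working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                        # weight vector (covariance direction)
        w = w / np.linalg.norm(w)
        t = Xr @ w                           # scores
        tt = t @ t
        p = Xr.T @ t / tt                    # X loadings
        c = (yr @ t) / tt                    # y loading
        Xr = Xr - np.outer(t, p)             # deflate X
        yr = yr - c * t                      # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in original X space
    return coef, x_mean, y_mean

def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def rmsecv(X, y, n_components, n_folds=5, seed=0):
    """Root-mean-square error of k-fold cross-validation on the calibration set."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    pred = np.empty(len(y))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        model = pls1_fit(X[train], y[train], n_components)
        pred[held_out] = pls1_predict(model, X[held_out])
    return rmse(y, pred)

# Synthetic "spectra": 50 variables, only the first 5 carry signal (assumption).
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 50))
true_coef = np.zeros(50)
true_coef[:5] = [1.0, -0.8, 0.6, 0.5, -0.4]
y = X @ true_coef + rng.normal(scale=0.3, size=120)
X_cal, y_cal, X_val, y_val = X[:80], y[:80], X[80:], y[80:]

# Hypothetical selected subset, standing in for a variable selection result.
subsets = {"global": np.arange(50), "selected": np.arange(5)}
results = {}
for name, cols in subsets.items():
    cv = rmsecv(X_cal[:, cols], y_cal, n_components=3)
    model = pls1_fit(X_cal[:, cols], y_cal, n_components=3)
    ep = rmse(y_val, pls1_predict(model, X_val[:, cols]))
    results[name] = (cv, ep)
    print(f"{name:8s} RMSECV={cv:.4f} RMSEP={ep:.4f} gap={abs(cv - ep):.4f}")
```

A lower RMSEP together with a small gap between RMSECV and RMSEP is the pattern the abstract reads as a better and more stable model; the numbers printed by this sketch come from synthetic data and say nothing about the soybean straw results.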
Keywords