IEEE Access (Jan 2020)

A Bootstrapping Soft Shrinkage Approach and Interval Random Variables Selection Hybrid Model for Variable Selection in Near-Infrared Spectroscopy

  • Hasan Ali Gamal Al-Kaf,
  • Nayef Abdulwahab Mohammed Alduais,
  • Abdul-Malik H. Y. Saad,
  • Kim Seng Chia,
  • Abdulqader M. Mohsen,
  • Hitham Alhussian,
  • Ammar Abdo Mohammed Haidar Mahdi,
  • Wan Saiful-Islam Wan Salam

DOI
https://doi.org/10.1109/ACCESS.2020.3023681
Journal volume & issue
Vol. 8
pp. 168036 – 168052

Abstract

The high dimensionality of spectral datasets is a significant challenge for researchers and calls for effective methods that extract the optimal variable subset to improve the accuracy of predictions or classifications. In this study, a hybrid variable selection method based on an incremental number of variables, combining the bootstrapping soft shrinkage (BOSS) method and the interval random variable selection (IRVS) method, is proposed and named BOSS-IRVS. The BOSS method is used to determine the informative intervals, while the IRVS method is used to search for informative variables within the informative intervals determined by BOSS. The proposed BOSS-IRVS method was tested on seven publicly accessible near-infrared (NIR) spectroscopic datasets covering corn, diesel fuel, soy, wheat protein, and hemoglobin. Its performance was compared with that of two established variable selection methods: BOSS and the hybrid variable selection strategy based on continuous shrinkage of variable space (VCPA-IRIV). The experimental results showed that BOSS-IRVS outperforms the VCPA-IRIV and BOSS methods on all tested datasets, improving prediction accuracy by 15.4% and 15.3% for corn moisture, 13.4% and 49.8% for corn oil, 41.5% and 50.6% for corn protein, 12.6% and 5.6% for soy moisture, 0.6% and 6.3% for total diesel fuel, 19.9% and 14.3% for wheat protein, and 5.8% and 20.3% for hemoglobin.

Keywords