Case Studies in Chemical and Environmental Engineering (Dec 2023)

A novel strategy of NIR spectra multivariate calibration in the presence both of small dataset and non-linearity: A comparative study

  • Devianti,
  • Adi Saputra Ismy,
  • Herbert Hasudungan Siahaan,
  • Agustami Sitorus

Journal volume & issue
Vol. 8
p. 100384

Abstract

Non-linearity and small sample sizes are constraints that often occur together in NIR spectral data (381–1065 nm). This leads some chemometricians to reconsider whether a linear calibration method can yield a reasonable and robust calibration model. At the same time, an NIR spectra-based approach, even when it yields an accurate and robust calibration model, is still an alternative method and must therefore remain low-cost and easy to apply. This study introduces a novel strategy for developing robust calibration models from small, non-linear NIR spectral datasets. The prediction performance of two groups of chemometric methods, linear (partial least squares regression, PLSR) and non-linear calibration techniques (k-nearest neighbor, k-NN; Ada boosting, AB; Bayesian ridge regression, BRR), was also compared and investigated in depth. A total of forty raw NIR spectra were used to develop calibration models to predict the content of B-pinene, D-limonene, and safrole in nutmeg fruit. In the first part of the strategy, non-linearity caused by light scattering in the NIR spectral data is handled directly by non-linear machine-learning calibration algorithms, without preprocessing. In the second part, the robustness of the model is tested by repeatedly splitting the data at random, without supervision, followed by a rigorous statistical procedure to ensure a reliable comparison. The results suggest that the non-linear calibration methods are the most promising among the methods investigated.
Furthermore, although no single technique is the best for every reference compound, k-NN (for prediction of B-pinene and safrole) and BRR (for prediction of B-pinene and D-limonene) are found to be the most promising in terms of low prediction error (the maximum Rp² is 81.6%, and RMSE is less than 1.139%). Some of the non-linear calibration techniques explored achieved only limited success.
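
The repeated random-splitting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic (forty simulated "spectra" with a saturating, non-linear response), the function names (knn_predict, rmse_and_r2, repeated_random_split, make_synthetic) are hypothetical, and the number of repeats, the test fraction, and k are assumed values chosen only for illustration. A plain k-NN regressor stands in for the non-linear calibration methods, and only the Python standard library is used.

```python
import math
import random
import statistics


def knn_predict(train_X, train_y, x, k=3):
    """Predict the target for one spectrum as the mean of its k nearest neighbours."""
    dists = sorted(
        (math.dist(row, x), target) for row, target in zip(train_X, train_y)
    )
    return statistics.fmean(target for _, target in dists[:k])


def rmse_and_r2(y_true, y_pred):
    """Return (RMSE, R^2) for paired reference and predicted values."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_t = statistics.fmean(y_true)
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return math.sqrt(ss_res / len(y_true)), 1.0 - ss_res / ss_tot


def repeated_random_split(X, y, n_repeats=20, test_fraction=0.25, k=3, seed=0):
    """Evaluate k-NN on many random train/test splits; return a list of (RMSE, R^2)."""
    rng = random.Random(seed)
    n_test = max(2, round(len(X) * test_fraction))
    scores = []
    for _ in range(n_repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)  # a fresh random partition on every repeat
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        y_true = [y[i] for i in test_idx]
        y_pred = [knn_predict(train_X, train_y, X[i], k) for i in test_idx]
        scores.append(rmse_and_r2(y_true, y_pred))
    return scores


def make_synthetic(n_samples=40, n_bands=10, seed=1):
    """Synthetic stand-in for a small NIR dataset (assumption, not real spectra)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        conc = rng.uniform(0.0, 5.0)
        # Saturating (non-linear) band response plus a little noise.
        spectrum = [
            math.tanh(conc * (b + 1) / n_bands) + rng.gauss(0.0, 0.01)
            for b in range(n_bands)
        ]
        X.append(spectrum)
        y.append(conc)
    return X, y


X, y = make_synthetic()
scores = repeated_random_split(X, y)
rmses = [s[0] for s in scores]
print(f"RMSE over splits: mean={statistics.fmean(rmses):.3f}, "
      f"sd={statistics.stdev(rmses):.3f}")
```

Reporting the spread of RMSE across splits, rather than a single split, is what makes a comparison between methods meaningful on a dataset this small; a formal statistical test on the per-split scores would be the next step.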

Keywords