Food Innovation and Advances (Jan 2024)

Optimisation of vibrational spectroscopy instruments and pre-processing for classification problems across various decision parameters

  • Joy Sim,
  • Cushla McGoverin,
  • Indrawati Oey,
  • Russell Frew,
  • Biniam Kebede

DOI
https://doi.org/10.48130/fia-0024-0004
Journal volume & issue
Vol. 3, no. 1
pp. 52 – 63

Abstract

Vibrational spectroscopy is a green, rapid, and affordable analytical tool for analysing the quality, safety, and origin of biological materials in agri-food sectors. Pre-processing spectral data is crucial for removing instrumental interferences and physical artifacts when developing a classification model. However, there is not yet a consensus on which spectral pre-processing methods, settings, and decision parameters should be used to optimise pre-processing for different spectroscopy tools. Using an arbitrary criterion risks applying the wrong type of pre-processing, or overly severe pre-processing, which removes valuable information or degrades the model's performance in prediction studies. The Matthews correlation coefficient (MCC), a statistic for parameterising classification performance, accounts for data set imbalance and improves decisions on model selection to express uncertainty in future predictions. Four vibrational spectroscopy instruments [near-infrared (NIR), hyperspectral (HSI), mid-infrared (FTIR), and Raman] were compared under different pre-processing methods, with MCC used to assess how well each classified coffee from four countries (Indonesia, Ethiopia, Brazil, and Rwanda). Key decision parameters were evaluated for the development of reliable classification models. The best pre-processing for NIR was extended multiplicative scatter correction with mean centering (MNCN), and for HSI, Savitzky-Golay (1st derivative, 15 points) with MNCN. NIR performed best of the four instruments, and FTIR performed worst. Raman showed potential for coffee origin classification when given appropriate pre-processing. Pre-processing with weighted least squares, normalisation, and MNCN eliminated the fluorescence effect on Raman spectral data. These findings show the feasibility of using MCC for classification problems.
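
To illustrate why MCC is less easily flattered by class imbalance than plain accuracy, the following minimal Python sketch computes the multiclass form of MCC from true and predicted labels. It is not the authors' implementation: the function names `multiclass_mcc` and `accuracy` are hypothetical, and the labels below are made-up toy data, not results from the study.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                   # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correctly classified
    t_counts = Counter(y_true)                        # occurrences of each true class
    p_counts = Counter(y_pred)                        # occurrences of each predicted class
    labels = set(t_counts) | set(p_counts)
    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_true = s * s - sum(v * v for v in t_counts.values())
    cov_pred = s * s - sum(v * v for v in p_counts.values())
    denominator = sqrt(cov_true * cov_pred)
    # Convention: return 0.0 when the denominator vanishes
    # (for example, when every sample is assigned to a single class).
    return numerator / denominator if denominator else 0.0


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, imbalanced example (assumed data): 8 samples of one origin, 2 of another.
y_true = ["Brazil"] * 8 + ["Ethiopia"] * 2
y_pred = ["Brazil"] * 10   # a model that always predicts the majority class

print("accuracy:", accuracy(y_true, y_pred))
print("MCC:", multiclass_mcc(y_true, y_pred))
```

In this toy case the always-majority model reaches 80% accuracy yet scores an MCC of zero, because it carries no information about the minority class. A perfect classifier scores 1, and a binary classifier that is always wrong scores -1.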

Keywords