Talanta Open (Aug 2023)
Guidelines to build PLS-DA chemometric classification models using a GC-IMS method: Dry-cured ham as a case of study
Abstract
The number of representative samples used to build a calibration model plays a major role in the success of chemometric models for class discrimination; knowing which samples to use for calibrating prediction models is therefore essential. The aim of this work is to design a basic guideline for training partial least squares discriminant analysis (PLS-DA) models to classify complex samples analysed by gas chromatography (GC) coupled to ion mobility spectrometry (IMS), using dry-cured Iberian ham as an example. The effect of the number, proportion and class of samples used for training and validation, and of two data types (spectral fingerprint or pre-selected markers), was assessed by analysing nearly 1000 dry-cured Iberian ham samples from 7 different curing plants with GC-IMS. The samples were then classified with PLS-DA according to the pig's feeding regime (acorn-fed vs. feed-fed). Training on 450 of the 997 samples proved sufficient to reach the maximum average prediction accuracy. Furthermore, pre-selected GC-IMS markers gave slightly better prediction results than the complete spectral fingerprint. In summary, these results represent a tentative guide for classifying samples in an industrial setting using GC-IMS and PLS-DA. As this study shows, the methodology would allow authorities and producers to verify the quality of agri-food products placed on the market.
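The training-size experiment described in the abstract can be illustrated with a minimal sketch. It does not reproduce the authors' data, preprocessing, or model settings. The data are synthetic (two Gaussian classes standing in for acorn-fed and feed-fed samples), and the function names, the two latent variables, the 0.5 decision threshold, and the split sizes are all assumptions made for illustration. PLS-DA is implemented here as PLS1 regression (NIPALS) on a 0/1 class label.

```python
import random

def make_synthetic(n, p, rng, shift=1.5, informative=5):
    """Hypothetical two-class data: class 1 is shifted on the first few features."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) + (shift if label and j < informative else 0.0)
               for j in range(p)]
        X.append(row)
        y.append(label)
    return X, y

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def fit_plsda(X, y, n_comp=2):
    """PLS1 (NIPALS) regression on a 0/1 dummy response (assumed settings)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    comps = []
    for _ in range(n_comp):
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]  # weights ~ X'y
        norm = dot(w, w) ** 0.5
        if norm == 0.0:  # nothing left to explain (e.g. a single-class subset)
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        load = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = dot(f, t) / tt
        # Deflate X and y before extracting the next latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def predict(model, X):
    """Predict class labels by thresholding the regression output at 0.5."""
    x_mean, y_mean, comps = model
    labels = []
    for row in X:
        e = [v - m for v, m in zip(row, x_mean)]
        score = y_mean
        for w, load, q in comps:
            t = dot(e, w)
            score += q * t
            e = [ev - t * lv for ev, lv in zip(e, load)]
        labels.append(1 if score > 0.5 else 0)
    return labels

def accuracy_vs_training_size(X, y, sizes, n_test, repeats, rng):
    """Mean held-out accuracy for each training-set size over random splits."""
    results = {}
    idx = list(range(len(X)))
    for size in sizes:
        accs = []
        for _ in range(repeats):
            rng.shuffle(idx)
            test, train = idx[:n_test], idx[n_test:n_test + size]
            model = fit_plsda([X[i] for i in train], [y[i] for i in train])
            pred = predict(model, [X[i] for i in test])
            accs.append(sum(pr == y[i] for pr, i in zip(pred, test)) / n_test)
        results[size] = sum(accs) / repeats
    return results

rng = random.Random(0)
X, y = make_synthetic(300, 20, rng)
curve = accuracy_vs_training_size(X, y, sizes=[20, 50, 100, 200],
                                  n_test=100, repeats=5, rng=rng)
for size, acc in curve.items():
    print(size, round(acc, 3))
```

Plotting mean accuracy against training-set size shows where the curve levels off, which is the kind of evidence behind a statement such as "450 of 997 samples are enough". In practice, an established implementation such as scikit-learn's `PLSRegression` could replace the hand-written routine; the pure-Python version is used here only to keep the sketch free of dependencies.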