Journal of Spectroscopy (Jan 2021)
Comparison of Machine Learning Classification Methods for Determining the Geographical Origin of Raw Milk Using Vibrational Spectroscopy
Abstract
Determining geographical origin is one of the significant challenges in the food industry, since raw milk can vary considerably from one region to another. Monitoring the origin of raw milk has therefore become highly relevant for producers and consumers worldwide. In this exploratory study, Fourier transform mid-infrared (FT-MIR) spectroscopy combined with machine learning classification methods was investigated as a rapid and nondestructive approach for classifying milk according to its geographical origin. Because the curse of dimensionality makes it difficult for some classification methods to train effective models, principal component analysis (PCA) was applied to extract a smaller set of features. Four methods were compared: partial least squares discriminant analysis (PLS-DA), PCA followed by linear discriminant analysis (PCA-LDA), support vector machines (SVM), and PCA followed by SVM (PCA-SVM). The best results were obtained with PLS-DA and PCA-LDA, each with a correct classification rate (CCR) of 100%, followed by PCA-SVM with a CCR of 94.95%, whereas SVM without feature extraction gave a low CCR of 66.67%. These findings demonstrate that FT-MIR spectroscopy, combined with machine learning methods, is an efficient and suitable approach for classifying the geographical origin of raw milk.
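The comparison between SVM on full spectra and SVM on PCA scores can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, and it uses synthetic data as a stand-in for spectra. The number of classes, samples, channels, and principal components, the standardization step, the RBF kernel, and the 70/30 split are all hypothetical choices made for illustration, not the settings of the study, so the CCR values it prints bear no relation to the reported ones.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for spectra (hypothetical sizes): 3 origins,
# 40 samples per origin, 600 spectral channels per sample.
n_classes, n_per_class, n_channels = 3, 40, 600
axis = np.linspace(0.0, 1.0, n_channels)

X_parts, y_parts = [], []
for label in range(n_classes):
    # Each origin gets one smooth band at a slightly shifted position.
    centre = 0.3 + 0.2 * label
    band = np.exp(-((axis - centre) ** 2) / 0.002)
    noise = rng.normal(scale=0.5, size=(n_per_class, n_channels))
    X_parts.append(band + noise)
    y_parts.append(np.full(n_per_class, label))

X = np.vstack(X_parts)
y = np.concatenate(y_parts)

# Hold out 30% of the samples, stratified by origin.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

def ccr(model):
    """Fit on the training split; return the fraction of test samples classified correctly."""
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))

# SVM trained directly on all 600 channels.
svm_only = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# SVM trained on the first 5 principal component scores (assumed count).
pca_svm = make_pipeline(StandardScaler(), PCA(n_components=5), SVC(kernel="rbf"))

ccr_svm = ccr(svm_only)
ccr_pca_svm = ccr(pca_svm)

print(f"SVM     CCR: {100 * ccr_svm:.2f}%")
print(f"PCA-SVM CCR: {100 * ccr_pca_svm:.2f}%")
```

Wrapping PCA inside the pipeline means the components are estimated from the training split only, so the held-out samples do not influence the feature extraction step.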