Investigation of Seasonal Variation in Fatty Acid and Mineral Concentrations of Pecorino Romano PDO Cheese: Imputation of Missing Values for Enhanced Classification and Metabolic Profile Reconstruction
Leonardo Sibono,
Massimiliano Grosso,
Stefania Tronci,
Massimiliano Errico,
Margherita Addis,
Monica Vacca,
Cristina Manis,
Pierluigi Caboni
Affiliations
Leonardo Sibono
Department of Mechanical, Chemical and Materials Engineering, University of Cagliari, Via Marengo 2, 09123 Cagliari, Italy
Massimiliano Grosso
Department of Mechanical, Chemical and Materials Engineering, University of Cagliari, Via Marengo 2, 09123 Cagliari, Italy
Stefania Tronci
Department of Mechanical, Chemical and Materials Engineering, University of Cagliari, Via Marengo 2, 09123 Cagliari, Italy
Massimiliano Errico
Department of Green Technology, University of Southern Denmark, Campusvej 55, 5230 Odense, Denmark
Margherita Addis
Servizio Ricerca Prodotti di Origine Animale, Agris Sardegna, Loc. Bonassai, 07040 Sassari, Italy
Monica Vacca
Servizio Ricerca Studi Ambientali, Difesa delle Colture e Qualità delle Produzioni, Viale Trieste, 09123 Cagliari, Italy
Cristina Manis
Dipartimento di Scienze della vita e Ambiente, Cittadella Universitaria di Monserrato Blocco A, 09012 Monserrato, Italy
Pierluigi Caboni
Dipartimento di Scienze della vita e Ambiente, Cittadella Universitaria di Monserrato Blocco A, 09012 Monserrato, Italy
Seasonal variation in fatty acid and mineral concentrations was investigated by analyzing Pecorino Romano cheese samples collected in January, April, and June. A fraction of the samples had missing values in their fatty acid profiles. Probabilistic principal component analysis (PPCA), coupled with linear discriminant analysis (LDA), was employed to classify the cheese samples by production season while accounting for the missing data, and to estimate the fatty acid concentrations that were missing from those samples. The levels of rumenic acid, vaccenic acid, and omega-3 compounds were positively correlated with the spring season, while the chain length of the saturated fatty acids increased over the production seasons. In terms of classification performance, the optimal number of principal components (i.e., 5) achieved a cross-validation accuracy of 98%. When the model was then tasked with imputing the missing fatty acid concentrations, the optimal number of principal components yielded a cross-validation R² of 99.53%.
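To make the imputation step more concrete, the following is a minimal sketch of low-rank imputation on synthetic data. It is not the authors' implementation: it swaps in a simpler iterative PCA (truncated SVD) reconstruction, which can be viewed as a noise-free relative of probabilistic PCA. The function name `impute_low_rank`, the synthetic matrix, the 10% missing rate, and the choice of 2 components are all assumptions made purely for illustration. The classification step with linear discriminant analysis is not shown.

```python
import numpy as np


def impute_low_rank(X, n_components, n_iter=200, tol=1e-8):
    """Fill NaNs in X by iterating a rank-k PCA reconstruction.

    Hypothetical helper for illustration; a simplified stand-in for PPCA.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initialise missing entries with the column means of observed values.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    k = n_components
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        # Rank-k reconstruction of the centred data, shifted back by the mean.
        recon = (U[:, :k] * s[:k]) @ Vt[:k] + mu
        # Only missing entries are overwritten; observed values are kept.
        updated = np.where(missing, recon, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    return filled


# Synthetic stand-in for a samples-by-variables concentration matrix
# (assumed data, not taken from the study).
rng = np.random.default_rng(0)
scores = rng.normal(size=(60, 2))
loadings = rng.normal(size=(2, 8))
X_true = scores @ loadings + 0.01 * rng.normal(size=(60, 8))

# Hide roughly 10% of the entries to mimic missing measurements.
mask = rng.random(X_true.shape) < 0.10
X_obs = X_true.copy()
X_obs[mask] = np.nan

X_hat = impute_low_rank(X_obs, n_components=2)

# Coefficient of determination computed on the hidden entries only.
ss_res = np.sum((X_true[mask] - X_hat[mask]) ** 2)
ss_tot = np.sum((X_true[mask] - X_true[mask].mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"R2 on held-out entries: {r2:.4f}")
```

In practice, the number of components would be chosen by cross-validation, for example by repeating the masking step over several folds and comparing the resulting R² values for each candidate rank.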