OENO One (Nov 2022)

Accurate varietal classification and quantification of key quality compounds of grape extracts using the absorbance-transmittance fluorescence excitation emission matrix (A-TEEM) method and machine learning

  • Adam Gilmore,
  • Qiang Sui,
  • Bryant Blair,
  • Bruce S. Pan

DOI
https://doi.org/10.20870/oeno-one.2022.56.4.5561
Journal volume & issue
Vol. 56, no. 4

Abstract

Rapid and accurate quantification of grape berry phenolics, anthocyanins and tannins, together with identification of grape varieties, is important for effective quality control of harvesting and initial processing for winemaking. Current reference technologies, including High-Performance Liquid Chromatography (HPLC), can be rate-limiting and too complex and expensive for effective field operations. In this paper, we analyse robotically prepared grape extracts from several key varieties (numbers of Calibration/Prediction samples in parentheses), including Cabernet-Sauvignon (64/10), Grenache (16/4), Malbec (14/4), Merlot (56/10), Petite Sirah (52/10), Pinot noir (54/8), Syrah (20/2), Teroldego (14/2) and Zinfandel (62/12). Key phenolic and anthocyanin parameters measured by HPLC included Catechin, Epicatechin, Quercetin Glycosides, Malvidin 3-glucoside, Total Anthocyanins and Polymeric Tannins. Split samples diluted 50-fold in 50 % EtOH at pH 2 were analysed in parallel using the A-TEEM method, following Multi-block Data Fusion of the absorbance and unfolded EEM data. A-TEEM chemical data were calibrated (n = 390) using Extreme Gradient Boosting (XGB) Regression and evaluated based on the Root Mean Square Error of Prediction (RMSEP), the Relative Error of Prediction (REP) and the Coefficient of Determination (R2P) of the Prediction data (n = 62). The regression results yielded an average REP of 5.89 ± 2.47 % and an average R2P of 0.941 ± 0.025. While we consider the REP values to be in the acceptable range at well below 10 %, we acknowledge that the repeatability of both the grape extraction method and the HPLC reference method (5-8 % RSD) likely constituted the major sources of variation compared to the A-TEEM instrumental sample repeatability (< 2 % RSD). The varietal classification was analysed using Agglomerative Hierarchical Cluster Analysis (HCA) and XGB discriminant analysis of the multi-block data. The classification results yielded 100 % True Positive and True Negative responses for the Calibration and Prediction Data for all tested varieties. We conclude that the A-TEEM method requires minimal sample preparation, offers rapid acquisition times (< 1 min) and can serve as an accurate secondary method for both grape varietal identification and phenolic quantification. Importantly, the software application of the regression and classification models can be effectively automated for operators.
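
The three figures of merit named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `prediction_metrics` and the sample values are hypothetical, and REP is assumed here to be RMSEP expressed as a percentage of the mean reference value, a common chemometric convention that the abstract does not spell out.

```python
import math

def prediction_metrics(y_ref, y_pred):
    """Return RMSEP, REP (%) and R2P for paired reference/predicted values.

    Assumption: REP = 100 * RMSEP / mean(y_ref). The paper may define it differently.
    """
    n = len(y_ref)
    residuals = [r - p for r, p in zip(y_ref, y_pred)]
    ss_res = sum(e * e for e in residuals)          # residual sum of squares
    mean_ref = sum(y_ref) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)  # total sum of squares
    rmsep = math.sqrt(ss_res / n)
    rep = 100.0 * rmsep / mean_ref
    r2p = 1.0 - ss_res / ss_tot
    return {"RMSEP": rmsep, "REP_percent": rep, "R2P": r2p}

# Hypothetical reference (e.g. HPLC) and model-predicted concentrations.
y_ref = [10.0, 20.0, 30.0, 40.0]
y_pred = [11.0, 19.0, 31.0, 39.0]
metrics = prediction_metrics(y_ref, y_pred)
print(metrics)
```

For these made-up values every residual is ±1, so RMSEP is 1, REP is 4 % of the mean reference value of 25, and R2P is 1 − 4/500 = 0.992.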

Keywords