Artificial Intelligence in Agriculture (Mar 2023)
t-SNE: A study on reducing the dimensionality of hyperspectral data for the regression problem of estimating oenological parameters
Abstract
In recent years, machine learning techniques have become increasingly important for improving procedures in precision agriculture. In this work we study models capable of predicting oenological parameters from hyperspectral images of wine grape berries, an especially relevant topic for supporting production tasks for winemakers. Specifically, we explore the capabilities of a novel technique mostly used for visualization, t-Distributed Stochastic Neighbor Embedding (t-SNE), for reducing the dimensionality of highly complex hyperspectral data, and we compare its performance with that of Principal Component Analysis (PCA), which, despite the introduction of many nonlinear dimensionality reduction techniques over the years, has achieved the best results for real-world data across several studies in the literature. Additionally, we explore the potential of Kernel t-SNE, an extension of the t-SNE method that allows the technique to be used on streaming data or in online scenarios. Our results show that, in a direct comparison, t-SNE achieves better metrics than PCA for most of the data sets in this work, and that the regressor (Support Vector Regression, SVR) performs better with the t-SNE-reduced features as inputs, producing predictions with lower error. Compared with the current literature, our shallow learning model paired with t-SNE achieves results that are better than or on par with those reported, even competing with more advanced models that use deep learning techniques, which should encourage the adoption of t-SNE in more studies that require dimensionality reduction.
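The comparison described above (reduce the spectra with PCA or t-SNE, then fit an SVR on the reduced features) can be illustrated with a minimal sketch using scikit-learn. Everything specific here is an assumption made for illustration and is not taken from the study: the synthetic data standing in for hyperspectral measurements, the target variable, the two-dimensional embedding, the hyperparameters, and the helper names `reduce_pca`, `reduce_tsne`, and `cv_rmse`. Note that scikit-learn's `TSNE` offers no `transform` method for unseen samples, so the sketch embeds the whole data set before cross-validation; this out-of-sample limitation is the one that Kernel t-SNE is meant to address.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for hyperspectral data (assumption): 150 samples x 200 bands,
# generated from 3 latent factors through a nonlinear mixing.
n_samples, n_bands = 150, 200
latent = rng.uniform(-1.0, 1.0, size=(n_samples, 3))
mixing = rng.normal(size=(3, n_bands))
X = np.tanh(latent @ mixing) + 0.05 * rng.normal(size=(n_samples, n_bands))

# Hypothetical regression target standing in for an oenological parameter.
y = 2.0 * latent[:, 0] + latent[:, 1] ** 2 + 0.05 * rng.normal(size=n_samples)

# Standardize each band before dimensionality reduction.
X_std = StandardScaler().fit_transform(X)


def reduce_pca(X, n_components=2):
    """Linear reduction with PCA."""
    return PCA(n_components=n_components, random_state=0).fit_transform(X)


def reduce_tsne(X, n_components=2):
    """Nonlinear reduction with t-SNE (embeds only the samples it is fit on)."""
    tsne = TSNE(n_components=n_components, perplexity=30.0,
                init="pca", random_state=0)
    return tsne.fit_transform(X)


def cv_rmse(Z, y):
    """5-fold cross-validated RMSE of an RBF-kernel SVR on reduced features Z."""
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
    pred = cross_val_predict(model, Z, y, cv=5)
    return float(np.sqrt(np.mean((pred - y) ** 2)))


Z_pca = reduce_pca(X_std)
Z_tsne = reduce_tsne(X_std)
rmse_pca = cv_rmse(Z_pca, y)
rmse_tsne = cv_rmse(Z_tsne, y)

print(f"SVR on PCA features:   RMSE = {rmse_pca:.3f}")
print(f"SVR on t-SNE features: RMSE = {rmse_tsne:.3f}")
```

Which reduction yields the lower error on this toy data says nothing about real hyperspectral measurements; the sketch only shows the shape of the comparison. Because the embeddings are computed on all samples before the folds are split, the reported errors are optimistic for both methods.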