Heliyon (Sep 2024)

A two-step framework integrating lasso and Relaxed Lasso for resolving multidimensional collinearity in Chinese baijiu aging research

  • Dongyue An,
  • Liangyan Wang,
  • Jiang He,
  • Yuejin Hua

Journal volume & issue
Vol. 10, no. 17
p. e36871

Abstract

The aging process is crucial for Chinese Baijiu production, significantly enhancing the spirit's flavor, aroma, and quality. However, aging involves a complex interplay of numerous compounds, and the long duration it requires leaves few samples available for scientific research. The resulting data are high-dimensional and collinear, which makes them difficult to analyze and complicates understanding of the underlying chemical processes. To address this issue, this article proposes a two-step framework that integrates Relaxed Lasso regression models with Lasso-selected predictors. Baijiu samples subjected to various aging conditions were analyzed using direct GC-MS and HS-GC-MS, and the resulting data were processed with this approach. The framework performed significantly better than other methods, including PLSR and Gradient Boosting. Analyses of a previously documented dataset also yielded improved results, underscoring the method's advantage in processing high-dimensional data with multicollinearity. Moreover, the method proved effective in screening potential indicative compounds, highlighting its utility in Baijiu aging research.
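
The two-step idea described above (select predictors with the Lasso, then refit a less heavily penalized model on the selected predictors only) can be illustrated with a minimal sketch. This is not the authors' implementation: the coordinate-descent solver, the function names, the penalty `lam`, the relaxation factor `phi` (0 gives an unpenalized refit, 1 reproduces the plain Lasso), and the synthetic data are all assumptions chosen for illustration. The data mimic the setting of few samples, many predictors, and one nearly collinear pair.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=500):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, kept up to date after each update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (excluding j)
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def two_step_relaxed_lasso(X, y, lam, phi):
    """Step 1: Lasso selects predictors. Step 2: refit on them with penalty phi*lam."""
    beta_lasso = lasso_cd(X, y, lam)
    selected = [j for j, b in enumerate(beta_lasso) if b != 0.0]
    beta = [0.0] * len(beta_lasso)
    if selected:
        X_sel = [[row[j] for j in selected] for row in X]
        for j, b in zip(selected, lasso_cd(X_sel, y, phi * lam)):
            beta[j] = b
    return selected, beta

# Synthetic data (hypothetical): 20 samples, 30 predictors, columns 0 and 1 nearly collinear.
random.seed(0)
n, p = 20, 30
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
for row in X:
    row[1] = row[0] + 0.05 * random.gauss(0.0, 1.0)
y = [3.0 * row[0] - 2.0 * row[5] + 0.1 * random.gauss(0.0, 1.0) for row in X]

# Penalty set relative to lam_max, the smallest penalty that zeroes every coefficient.
lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(p))
selected, beta_relaxed = two_step_relaxed_lasso(X, y, lam=0.1 * lam_max, phi=0.1)

print("selected predictors:", selected)
print("relaxed coefficients:", [round(beta_relaxed[j], 3) for j in selected])
```

In practice `lam` and `phi` would be tuned, for example by cross-validation, and the predictors would typically be standardized first. The sketch fixes both values only to keep the example short.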

Keywords