Agriculture (Jul 2023)

Predicting Models for Plant Metabolites Based on PLSR, AdaBoost, XGBoost, and LightGBM Algorithms Using Hyperspectral Imaging of Brassica juncea

  • Hyo In Yoon,
  • Hyein Lee,
  • Jung-Seok Yang,
  • Jae-Hyeong Choi,
  • Dae-Hyun Jung,
  • Yun Ji Park,
  • Jai-Eok Park,
  • Sang Min Kim,
  • Soo Hyun Park

DOI
https://doi.org/10.3390/agriculture13081477
Journal volume & issue
Vol. 13, no. 8
p. 1477

Abstract

The integration of hyperspectral imaging with machine learning algorithms has presented a promising strategy for the non-invasive and rapid detection of plant metabolites. For this study, we developed prediction models using partial least squares regression (PLSR) and boosting algorithms (such as AdaBoost, XGBoost, and LightGBM) for five metabolites in Brassica juncea leaves: total chlorophyll, phenolics, flavonoids, glucosinolates, and anthocyanins. To enhance the model performance, we employed several spectral data preprocessing methods and feature-selection algorithms. Our results showed that the boosting algorithms generally outperformed the PLSR models in terms of prediction accuracy. In particular, the LightGBM model for chlorophyll and the AdaBoost model for flavonoids improved the prediction performance, with R²p = 0.71–0.74, compared to the PLSR models (R²p = 0.53–0.58). The final models for the glucosinolates and anthocyanins performed sufficiently for practical uses such as screening, with R²p = 0.82–0.85 and RPD = 2.4–2.6. Our findings indicate that the application of a single preprocessing method is more effective than utilizing multiple techniques. Additionally, the boosting algorithms with feature selection exhibited superior performance compared to the PLSR models in the majority of cases. These results highlight the potential of hyperspectral imaging and machine learning algorithms for the non-destructive and rapid detection of plant metabolites, which could have significant implications for the field of smart agriculture.
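The two metrics quoted in the abstract can be illustrated with a short sketch. It assumes the common chemometrics definitions: R²p is the coefficient of determination on the prediction (test) set, and RPD is the standard deviation of the reference values divided by the root mean square error of prediction. The abstract does not state which standard-deviation convention the authors used, so the sample form (ddof=1) below is an assumption, and the function name and the numbers are hypothetical.

```python
import numpy as np


def r2_and_rpd(y_true, y_pred):
    """Return (R^2, RPD) for a prediction set.

    Assumed definitions (not confirmed by the abstract):
      R^2 = 1 - SS_res / SS_tot
      RPD = SD(y_true) / RMSEP, with the sample SD (ddof=1)
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                 # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    rmsep = np.sqrt(np.mean(residuals ** 2))        # root mean square error of prediction
    rpd = np.std(y_true, ddof=1) / rmsep            # ratio of performance to deviation
    return r2, rpd


# Hypothetical reference and predicted metabolite contents (arbitrary units).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 1.5, 3.0, 4.5, 4.5]

r2, rpd = r2_and_rpd(y_true, y_pred)
print(f"R2p = {r2:.2f}, RPD = {rpd:.2f}")
```

Because RPD scales the prediction error by the spread of the reference data, it is comparable across metabolites measured in different units, which is presumably why it is reported alongside R²p when judging suitability for screening.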

Keywords