Frontiers in Plant Science (May 2024)

Tackling unbalanced datasets for yellow and brown rust detection in wheat

  • Carmen Cuenca-Romero,
  • Orly Enrique Apolo-Apolo,
  • Jaime Nolasco Rodríguez Vázquez,
  • Gregorio Egea,
  • Manuel Pérez-Ruiz

DOI
https://doi.org/10.3389/fpls.2024.1392409
Journal volume & issue
Vol. 15

Abstract

This study evaluates the efficacy of hyperspectral data for detecting yellow and brown rust in wheat, employing machine learning (ML) models and the Synthetic Minority Oversampling Technique (SMOTE) to tackle unbalanced datasets. Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF), and Gaussian Naïve Bayes (GNB) models were assessed. Overall, the SVM and RF models achieved higher accuracies than the ANN and GNB models, particularly when using SMOTE-enhanced datasets. The RF model achieved 70% accuracy in detecting yellow rust without data alteration. For brown rust, by contrast, the SVM model outperformed the others, reaching 63% accuracy with SMOTE applied to the training set. This study highlights the potential of spectral data and ML techniques in plant disease detection. It emphasizes the need for further research in data processing methodologies, particularly in exploring the impact of techniques like SMOTE on model performance.
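To make the oversampling step concrete, the following is a minimal sketch of the general SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. It is not the authors' implementation, which the abstract does not describe. The function names `smote` and `balance_training_set`, the default `k=5`, and the choice to oversample every class up to the size of the largest class are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if n_new <= 0:
        return []
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        others = [x for x in minority if x is not base]
        neighbours = sorted(others, key=lambda x: math.dist(base, x))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance_training_set(X_train, y_train, k=5, seed=0):
    """Oversample every class up to the size of the largest class.

    Assumption: only the training split is passed in, so the test split
    keeps its original, unbalanced class distribution.
    """
    by_class = {}
    for x, y in zip(X_train, y_train):
        by_class.setdefault(y, []).append(x)
    target = max(len(rows) for rows in by_class.values())
    # Original samples come first; synthetic samples are appended after them.
    X_bal, y_bal = list(X_train), list(y_train)
    for label, rows in by_class.items():
        extra = smote(rows, target - len(rows), k=k, seed=seed)
        X_bal.extend(extra)
        y_bal.extend([label] * len(extra))
    return X_bal, y_bal
```

In this sketch, calling `balance_training_set` on a training split with 8 healthy and 3 rust spectra would return the 11 original samples followed by 5 synthetic rust samples. Restricting the call to the training split mirrors the abstract's phrase "SMOTE applied to the training set" and keeps synthetic samples out of the evaluation data.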

Keywords