IEEE Access (Jan 2022)

Plant Identification in a Combined-Imbalanced Leaf Dataset

  • Viraj K. Gajjar,
  • Anand K. Nambisan,
  • Kurt L. Kosbar

DOI
https://doi.org/10.1109/ACCESS.2022.3165583
Journal volume & issue
Vol. 10
pp. 37882 – 37891

Abstract

Plant identification has applications in ethnopharmacology and agriculture. Since leaves are one of the distinguishing features of a plant, they are routinely used for identification. Recent developments in deep learning have made it possible to accurately identify the majority of samples in five publicly available leaf datasets. However, the images in each dataset were captured in a highly controlled environment. This paper evaluates the performance of EfficientNet and several other convolutional neural network (CNN) architectures when applied to a combination of the LeafSnap, Middle European Woody Plants 2014, Flavia, Swedish, and Folio datasets. To mitigate the class imbalance that results from combining the original datasets, we used oversampling, undersampling, and transfer learning techniques to construct an end-to-end CNN classifier. We placed greater emphasis on metrics appropriate for a diverse, imbalanced dataset than on high performance on any one of the original datasets. A model from EfficientNet’s family of CNN models achieved an F-score of 0.9861 on the combined dataset.
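
The abstract does not spell out how the resampling or the F-score were computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes random oversampling (with replacement) and random undersampling (without replacement) toward a fixed per-class count, and it assumes a macro-averaged F-score. The function names `rebalance` and `macro_f1` and the `target` parameter are hypothetical.

```python
import random
from collections import defaultdict


def rebalance(samples, labels, target, seed=0):
    """Resample so that every class ends up with exactly `target` items.

    Assumption: classes larger than `target` are randomly undersampled,
    and smaller classes are randomly oversampled by duplicating items.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    out_x, out_y = [], []
    for y, items in by_class.items():
        if len(items) >= target:
            # Undersample: draw `target` items without replacement.
            chosen = rng.sample(items, target)
        else:
            # Oversample: keep all items, then add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        out_x.extend(chosen)
        out_y.extend([y] * target)
    return out_x, out_y


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    classes = set(y_true) | set(y_pred)
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

A macro-averaged score illustrates why accuracy alone can mislead on imbalanced data: a classifier that labels every sample as the majority class in a 9-to-1 split is 90% accurate, yet `macro_f1` reports a value below 0.5 because the minority class contributes an F1 of zero.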

Keywords