Discover Sustainability (Jun 2024)

Efficient identification and classification of apple leaf diseases using lightweight vision transformer (ViT)

  • Wasi Ullah,
  • Kashif Javed,
  • Muhammad Attique Khan,
  • Faisal Yousef Alghayadh,
  • Mohammed Wasim Bhatt,
  • Imad Saud Al Naimi,
  • Isaac Ofori

DOI
https://doi.org/10.1007/s43621-024-00307-1
Journal volume & issue
Vol. 5, no. 1
pp. 1 – 14

Abstract

The timely diagnosis and identification of apple leaf diseases is essential to prevent their spread and to ensure the sound development of the apple industry. Convolutional neural networks (CNNs) have achieved considerable success in leaf disease detection, which can greatly benefit the agriculture industry. However, their large size and intricate design remain an obstacle to deploying these models on lightweight devices. Although several models (e.g., EfficientNets and MobileNets) have been designed for resource-constrained devices, they have not achieved strong results in leaf disease detection tasks, leaving a performance gap. This gap motivated us to develop an apple leaf disease detection model that can be deployed on lightweight devices and also outperform existing models. In this work, we propose AppViT, a hybrid vision model that combines convolution blocks with multi-head self-attention to compete with the best-performing models. Specifically, we begin with convolution blocks that reduce the size of the feature maps and help the model encode local features progressively. Then, we stack ViT blocks in combination with convolution blocks, allowing the network to capture non-local dependencies and spatial patterns. With these designs and a hierarchical structure, AppViT demonstrates excellent performance in apple leaf disease detection tasks. It achieves 96.38% accuracy on Plant Pathology 2021 - FGVC8 with about 1.3 million parameters, which is 11.3% and 4.3% more accurate than ResNet-50 and EfficientNet-B3, respectively. The precision, recall and F score of the proposed model on Plant Pathology 2021 - FGVC8 are 0.967, 0.959, and 0.963, respectively.
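The reported precision, recall and F score can be checked against each other with a few lines of arithmetic. The sketch below assumes the F score is the standard F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_score` and its `beta` parameter are illustrative and are not taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean (F1)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta**2 * precision + recall
    if denominator == 0:
        # Both inputs are zero; the score is conventionally taken as 0.
        return 0.0
    return (1 + beta**2) * precision * recall / denominator


# Values as reported in the abstract for Plant Pathology 2021 - FGVC8.
reported_precision = 0.967
reported_recall = 0.959

# Assumption: the reported "F score" is F1 (beta = 1).
f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.963"
```

Under that assumption the computed value agrees with the reported F score of 0.963 to three decimal places.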

Keywords