Plant Methods (Mar 2025)
DWTFormer: a frequency-spatial features fusion model for tomato leaf disease identification
Abstract
Pronounced inter-class similarity and intra-class variability among tomato leaf diseases seriously degrade the accuracy of identification models. To address this issue, a novel tomato leaf disease identification model, DWTFormer, based on frequency-spatial feature fusion, was proposed. First, a Bneck-DSM module was designed to extract shallow features, laying the groundwork for deep feature extraction. Then, a dual-branch feature mapping network (DFMM) was proposed to extract multi-scale disease features from frequency-domain and spatial-domain information. In the frequency branch, a feature decomposition module based on the 2D discrete wavelet transform captured the rich frequency information in the disease image, complementing the spatial-domain information. In the spatial branch, a module based on multi-scale convolution and the Pyramid Vision Transformer (PVT) was developed to extract global and local spatial features, enabling a comprehensive spatial representation. Finally, a dual-domain feature fusion model based on dynamic cross-attention was proposed to fuse the frequency and spatial features. Experimental results on the tomato leaf disease dataset showed that DWTFormer achieved 99.28% identification accuracy, outperforming most existing mainstream models. Furthermore, identification accuracies of 96.18% and 99.89% were obtained on the AI Challenger 2018 and PlantVillage datasets, respectively. In-field experiments showed that DWTFormer achieved an identification accuracy of 97.22% and an average inference time of 0.028 seconds in real planting environments. This work effectively reduces the impact of inter-class similarity and intra-class variability on tomato leaf disease identification and provides a scalable reference model for fast and accurate disease identification.
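To illustrate the kind of decomposition the frequency branch relies on, the following minimal sketch applies a single-level 2D discrete wavelet transform to a small grayscale grid. It is not the DWTFormer implementation: the Haar wavelet, the averaging normalisation, and the names `haar_dwt2`, `approx`, `col_detail`, `row_detail`, and `diag_detail` are assumptions made for illustration, since the abstract does not state which wavelet or decomposition level is used.

```python
# Illustrative single-level 2D Haar DWT in pure Python.
# Assumptions (not from the paper): Haar wavelet, averaging normalisation (divide by 4).

def haar_dwt2(image):
    """Split an even-sized 2D grid into four half-resolution sub-bands."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    approx, col_detail, row_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one non-overlapping 2x2 block.
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 4)  # block average (low frequency)
            c_row.append((tl - tr + bl - br) / 4)  # left column minus right column
            r_row.append((tl + tr - bl - br) / 4)  # top row minus bottom row
            d_row.append((tl - tr - bl + br) / 4)  # diagonal difference
        approx.append(a_row)
        col_detail.append(c_row)
        row_detail.append(r_row)
        diag_detail.append(d_row)
    return approx, col_detail, row_detail, diag_detail


# Hypothetical 4x4 input: flat regions on top, a horizontal edge at the bottom.
image = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [2, 2, 2, 2],
    [8, 8, 8, 8],
]
approx, col_detail, row_detail, diag_detail = haar_dwt2(image)
```

In this toy input, the approximation sub-band keeps a coarse copy of the image, while only the top-minus-bottom detail sub-band responds to the horizontal edge in the lower half. Sub-bands of this kind are the frequency-domain cues that a model could fuse with spatial features.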
Keywords