IEEE Access (Jan 2023)
CMENet: A Cross-Modal Enhancement Network for Tobacco Leaf Grading
Abstract
Tobacco leaf grading plays a crucial role in ensuring the quality of tobacco production. For a long time, grading has been performed manually by experienced experts. In recent years, several methods have been introduced to automate the process using reflection images of tobacco leaves. However, reflection images of leaves at different grades are visually very similar, so a single reflection image is insufficient for accurate grading. Moreover, leaves of the same grade may differ in visual appearance because of their different planting locations. An expert is known to integrate multimodal information, such as visual, tactile, and planting location cues, when grading. Inspired by this, we propose an end-to-end Cross-Modal Enhancement Network, named CMENet, for automatic tobacco leaf grading. In addition to the common reflection image, the network takes a transmission image as input to incorporate the thickness information used in manual grading. CMENet comprises a difference-aware fusion module and a meta self-attention module, which extract multimodal information from the transmission image and the planting location, respectively. Experimental results show that CMENet achieves a grading accuracy of 80.15% when incorporating multimodal information, surpassing existing methods that rely solely on reflection images.
Keywords