CMENet: A Cross-Modal Enhancement Network for Tobacco Leaf Grading

Qinglin He; Xiaobing Zhang; Jianxin Hu; Zehua Sheng; Qi Li; Si-Yuan Cao; Hui-Liang Shen

doi:10.1109/ACCESS.2023.3321111

IEEE Access (Jan 2023)

Affiliations

Qinglin He: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Xiaobing Zhang: China Tobacco Zhejiang Industrial Company Ltd., Hangzhou, China
Jianxin Hu: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Zehua Sheng: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Qi Li: China Tobacco Zhejiang Industrial Company Ltd., Hangzhou, China
Si-Yuan Cao: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
Hui-Liang Shen: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China

DOI: https://doi.org/10.1109/ACCESS.2023.3321111
Journal volume & issue: Vol. 11
pp. 109201 – 109212

Abstract

Tobacco leaf grading plays a crucial role in ensuring the quality of tobacco production. The grading process has long been performed manually by experienced experts. In recent years, methods have been introduced to automate grading using reflection images of tobacco leaves. However, reflection images of leaves at different grades are visually very similar, so a single reflection image is insufficient for accurate grading. In addition, tobacco leaves of the same grade may differ in visual appearance because of their different planting locations. An expert is known to integrate multimodal information, such as visual, tactile, and planting-location cues, when grading. Inspired by this, we propose an end-to-end Cross-Modal Enhancement Network, named CMENet, for automatic tobacco leaf grading. In addition to the common reflection image, the network takes a transmission image as input to incorporate the thickness information considered in manual grading. CMENet comprises a difference-aware fusion module and a meta self-attention module, which extract multimodal information from the transmission image and the planting location, respectively. Experimental results show that CMENet achieves a grading accuracy of 80.15% when incorporating multimodal information, surpassing existing methods that rely solely on reflection images.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
