CerviFusionNet: A multi-modal, hybrid CNN-transformer-GRU model for enhanced cervical lesion multi-classification

Yuyang Sha; Qingyue Zhang; Xiaobing Zhai; Menghui Hou; Jingtao Lu; Weiyu Meng; Yuefei Wang; Kefeng Li; Jing Ma

iScience (Dec 2024)

Affiliations

Yuyang Sha: Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
Qingyue Zhang: First Teaching Hospital of Tianjin University of Traditional Chinese Medicine, Tianjin 300381, China; National Clinical Research Center for Chinese Medicine Acupuncture and Moxibustion, Tianjin 300381, China
Xiaobing Zhai: Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
Menghui Hou: First Teaching Hospital of Tianjin University of Traditional Chinese Medicine, Tianjin 300381, China; National Clinical Research Center for Chinese Medicine Acupuncture and Moxibustion, Tianjin 300381, China
Jingtao Lu: Beijing University of Technology, School of Mathematical Statistics and Mechanics, Beijing 100124, China
Weiyu Meng: Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China
Yuefei Wang: National Key Laboratory of Chinese Medicine Modernization, State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China; Haihe Laboratory of Modern Chinese Medicine, Tianjin 301617, China
Kefeng Li: Center for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macau SAR 999078, China; Corresponding author
Jing Ma: First Teaching Hospital of Tianjin University of Traditional Chinese Medicine, Tianjin 300381, China; National Clinical Research Center for Chinese Medicine Acupuncture and Moxibustion, Tianjin 300381, China; Corresponding author

Journal volume & issue: Vol. 27, no. 12, p. 111313

Abstract

Summary: Cervical lesions pose a significant threat to women’s health worldwide. Colposcopy is essential for screening and treating cervical lesions, but its effectiveness depends on the doctor’s experience. Artificial intelligence-based solutions that use colposcopy images have shown great potential in cervical lesion screening. However, some challenges remain, such as low algorithm performance and a lack of high-quality multi-modal datasets. Here, we established a multi-modal colposcopy dataset of 2,273 HPV+ patients, comprising original colposcopy images, acetic acid reactions at 60 s and 120 s, iodine staining, diagnostic reports, and pathological results. Using this dataset, we developed CerviFusionNet, a hybrid architecture that merges convolutional neural networks and vision transformers to learn robust representations. We designed a temporal module to capture dynamic changes in acetic acid sequences, which boosts model performance without sacrificing inference speed. Compared with several existing methods, CerviFusionNet demonstrated excellent accuracy and efficiency.

Published in iScience

ISSN: 2589-0042 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Science
Website: http://www.cell.com/iscience/home
