SwinUNeLCsT: Global–local spatial representation learning with hybrid CNN–transformer for efficient tuberculosis lung cavity weakly supervised semantic segmentation

Zhuoyi Tan; Hizmawati Madzin; Bahari Norafida; Rahmita Wirza OK Rahmat; Fatimah Khalid; Puteri Suhaiza Sulaiman

Journal of King Saud University: Computer and Information Sciences (Apr 2024)

Affiliations

Zhuoyi Tan: Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, 43400, Malaysia
Hizmawati Madzin: Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, 43400, Malaysia; Corresponding author.
Bahari Norafida: Department of Radiology, Universiti Putra Malaysia, 43400 Serdang, Selangor, Malaysia
Rahmita Wirza OK Rahmat: Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, 43400, Malaysia
Fatimah Khalid: Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, 43400, Malaysia
Puteri Suhaiza Sulaiman: Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, 43400, Malaysia

Journal volume & issue: Vol. 36, no. 4, p. 102012

Abstract

Radiological diagnosis of lung cavities (LCs) is key to identifying tuberculosis (TB). Conventional deep learning methods rely on large amounts of accurate pixel-level annotation to segment LCs. Producing such annotations is time-consuming and laborious, especially for subtle LCs. To address these challenges, we first introduce SwinUNeLCsT, a novel hybrid convolutional neural network (CNN)-transformer model for 3D imaging of TB LCs. The core idea of SwinUNeLCsT is to combine local details and global dependencies in the feature representation of TB CT scans, improving the recognition of LCs. Second, to reduce the dependence on accurate pixel-level annotations, we design an end-to-end weakly supervised semantic segmentation (WSSS) framework for LCs. With this framework, radiologists need only classify the number and approximate location (e.g., left lung, right lung, or both) of LCs in a CT scan to achieve efficient segmentation of the LCs. This eliminates the need to draw boundaries meticulously, greatly reducing annotation cost. Extensive experimental results show that SwinUNeLCsT outperforms currently popular 3D medical segmentation methods in the supervised semantic segmentation paradigm. Our WSSS framework based on SwinUNeLCsT also performs best among existing state-of-the-art 3D medical WSSS methods.
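To make the weak-annotation idea concrete, the following is a minimal sketch of how a scan-level label (cavity count plus approximate location) might be turned into a classification target. The abstract does not specify the encoding used in the paper, so every name here (`WeakLabel`, `LOCATIONS`, `encode`, `max_count`) is hypothetical, and clipping large counts into one class is an assumption made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical location categories, following the abstract's example
# (left lung, right lung, or both), plus "none" for scans without cavities.
LOCATIONS = ("none", "left", "right", "both")


@dataclass(frozen=True)
class WeakLabel:
    """Scan-level annotation: how many cavities, and roughly where."""
    cavity_count: int
    location: str  # one of LOCATIONS

    def __post_init__(self):
        if self.cavity_count < 0:
            raise ValueError("cavity_count must be non-negative")
        if self.location not in LOCATIONS:
            raise ValueError(f"unknown location: {self.location!r}")
        # A scan has zero cavities exactly when its location is "none".
        if (self.cavity_count == 0) != (self.location == "none"):
            raise ValueError("cavity_count and location disagree")


def encode(label, max_count=3):
    """Return (count_class, location_one_hot) for a weak label.

    Assumption: counts above max_count share a single
    "max_count or more" class.
    """
    count_class = min(label.cavity_count, max_count)
    one_hot = [1 if name == label.location else 0 for name in LOCATIONS]
    return count_class, one_hot
```

Under these assumptions, `encode(WeakLabel(2, "both"))` yields a count class of 2 and a one-hot vector marking "both". No voxel-level mask appears anywhere in the label, which is the property that makes this style of annotation cheap to collect.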

Published in Journal of King Saud University: Computer and Information Sciences

ISSN: 1319-1578 (Print)
Publisher: Elsevier
Country of publisher: Saudi Arabia
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.journals.elsevier.com/journal-of-king-saud-university-computer-and-information-sciences/
