IEEE Access (Jan 2024)
GIVTED-Net: GhostNet-Mobile Involution ViT Encoder-Decoder Network for Lightweight Medical Image Segmentation
Abstract
Applying deep learning-based models for medical image segmentation on resource-constrained devices poses substantial challenges: the model must have a reduced parameter count and few floating-point operations (FLOPs) to run effectively on such devices. A common approach to this challenge is to use lightweight CNN-based architectures. However, conventional CNN layers such as convolution and pooling exhibit a spatial inductive bias that limits their ability to capture global context information directly. To address these limitations, this paper introduces GIVTED-Net, a lightweight encoder-decoder model for medical image segmentation with 0.19M parameters and 0.56G FLOPs. It employs Ghost bottlenecks in the encoder and Mobile Involution ViT (MIViT) modules in the decoder. Ghost bottlenecks, derived from the GhostNet architecture, provide computational efficiency. The newly introduced MIViT modules combine the benefits of involution and MobileViT principles with MetaFormer blocks, leading to the development of InvoFormer, a Transformer-like module built on involution and the squeeze-and-excitation mechanism. As a result, MIViT modules capture global context information effectively while keeping the parameter count and FLOPs low. The proposed GIVTED-Net is evaluated on several datasets, including the Kvasir Instrument, ISIC-2018 Lesion Boundary Segmentation, and WBC Image datasets, which cover different medical objects: instruments, lesions, and anatomical structures. Notably, GIVTED-Net outperforms existing models, demonstrating its suitability for deployment on resource-constrained devices.
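The involution operation at the core of InvoFormer can be illustrated with a minimal NumPy sketch. This follows the general semantics of involution (a per-pixel kernel, shared across the channels of a group, applied to the local neighborhood); the group count, kernel size, and dimensions below are illustrative choices, not the paper's exact configuration, and the kernel-generation network is omitted.

```python
import numpy as np

def involution(x, kernels, K=3):
    """Minimal involution: each spatial position applies its own
    K x K kernel, shared across all channels within a group.
    x:       (C, H, W) input feature map
    kernels: (G, K*K, H, W) position-specific kernels (normally
             generated from x by a small reduction network)
    """
    C, H, W = x.shape
    G = kernels.shape[0]
    pad = K // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for c in range(C):
        g = c // (C // G)  # channel group index
        for i in range(K):
            for j in range(K):
                # the weight for offset (i, j) varies per position (h, w),
                # unlike convolution, where it is shared across positions
                w = kernels[g, i * K + j]                 # (H, W)
                out[c] += w * xp[c, i:i + H, j:j + W]
    return out

# Toy example: 4 channels, 2 groups, 5x5 map, random per-pixel kernels
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 5))
kernels = rng.standard_normal((2, 9, 5, 5))
y = involution(x, kernels)
print(y.shape)  # (4, 5, 5)
```

Because the kernels are generated from the input itself, involution adapts its weights to each spatial location while sharing them across channels, which is the inverse of convolution's channel-specific, spatially shared weights and is what keeps the parameter count low.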
Keywords