Applied Sciences (May 2024)
SkinSwinViT: A Lightweight Transformer-Based Method for Multiclass Skin Lesion Classification with Enhanced Generalization Capabilities
Abstract
In recent decades, skin cancer has emerged as a significant global health concern, demanding timely detection and effective therapeutic interventions. Automated image classification via computational algorithms holds substantial promise in significantly improving the efficacy of clinical diagnoses. This study is committed to mitigating the challenge of diagnostic accuracy in the classification of multiclass skin lesions. This endeavor is inherently formidable owing to the resemblances among various lesions and the constraints associated with extracting precise global and local image features within diverse dimensional spaces using conventional convolutional neural network methodologies. Consequently, this study introduces the SkinSwinViT methodology for skin lesion classification, a pioneering model grounded in the Swin Transformer framework featuring a global attention mechanism. Leveraging the inherent cross-window attention mechanism within the Swin Transformer architecture, the model adeptly captures local features and interdependencies within skin lesion images while additionally incorporating a global self-attention mechanism to discern overarching features and contextual information effectively. The evaluation of the model’s performance involved the ISIC2018 challenge dataset. Furthermore, data augmentation techniques augmented training dataset size and enhanced model performance. Experimental results highlight the superiority of the SkinSwinViT method, achieving notable metrics of accuracy, recall, precision, specificity, and F1 score at 97.88%, 97.55%, 97.83%, 99.36%, and 97.79%, respectively.
Keywords