SkinSwinViT: A Lightweight Transformer-Based Method for Multiclass Skin Lesion Classification with Enhanced Generalization Capabilities

Kun Tang; Jing Su; Ruihan Chen; Rui Huang; Ming Dai; Yongjiang Li

doi:10.3390/app14104005

Applied Sciences (May 2024)

Affiliations

Kun Tang: School of Mathematics and Computer, Guangdong Ocean University, Zhanjiang 524008, China
Jing Su: School of Mathematics and Computer, Guangdong Ocean University, Zhanjiang 524008, China
Ruihan Chen: School of Mathematics and Computer, Guangdong Ocean University, Zhanjiang 524008, China
Rui Huang: School of Mathematics and Computer, Guangdong Ocean University, Zhanjiang 524008, China
Ming Dai: School of Mathematics and Computer, Guangdong Ocean University, Zhanjiang 524008, China
Yongjiang Li: School of Mathematics and Computer, Guangdong Ocean University, Zhanjiang 524008, China

DOI: https://doi.org/10.3390/app14104005
Journal volume & issue: Vol. 14, no. 10, p. 4005

Abstract

In recent decades, skin cancer has emerged as a significant global health concern that demands timely detection and effective treatment. Automated image classification holds substantial promise for improving the efficacy of clinical diagnosis. This study addresses diagnostic accuracy in the classification of multiclass skin lesions. The task is difficult because different lesions resemble one another and because conventional convolutional neural networks are limited in extracting precise global and local image features across different dimensional spaces. The study therefore introduces SkinSwinViT, a skin lesion classification model built on the Swin Transformer framework and extended with a global attention mechanism. The cross-window attention mechanism of the Swin Transformer captures local features and interdependencies within skin lesion images, while the additional global self-attention mechanism captures overall features and contextual information. The model was evaluated on the ISIC2018 challenge dataset. Data augmentation was used to enlarge the training set and improve model performance. In the experiments, SkinSwinViT achieved accuracy, recall, precision, specificity, and F1 score of 97.88%, 97.55%, 97.83%, 99.36%, and 97.79%, respectively.
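
The five metrics reported in the abstract can all be derived from a multiclass confusion matrix. The sketch below is illustrative only: the abstract does not state how the per-class values were averaged, so macro-averaging (an unweighted mean over classes) is an assumption here, and the function name `multiclass_metrics` and the 3-class example matrix are hypothetical and not taken from the paper.

```python
def multiclass_metrics(cm):
    """Compute accuracy and macro-averaged recall, precision,
    specificity, and F1 from a square confusion matrix.

    cm[i][j] = number of samples of true class i predicted as class j.
    Macro-averaging is an assumption; the paper may average differently.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    recalls, precisions, specificities, f1s = [], [], [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        tn = total - tp - fn - fp

        recall = tp / (tp + fn) if (tp + fn) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        recalls.append(recall)
        precisions.append(precision)
        specificities.append(specificity)
        f1s.append(f1)

    def mean(values):
        return sum(values) / len(values)

    return {
        "accuracy": correct / total if total else 0.0,
        "recall": mean(recalls),
        "precision": mean(precisions),
        "specificity": mean(specificities),
        "f1": mean(f1s),
    }


# Hypothetical 3-class confusion matrix, for illustration only.
example_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
example_metrics = multiclass_metrics(example_cm)
```

In a multiclass setting specificity is often much higher than recall, because each class's true negatives include every correctly handled sample of all other classes; this is consistent with specificity being the highest of the five figures reported above.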

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
