Frontiers in Medicine (Apr 2021)

A Deep Learning Based Framework for Diagnosing Multiple Skin Diseases in a Clinical Environment

  • Chen-Yu Zhu,
  • Yu-Kun Wang,
  • Hai-Peng Chen,
  • Kun-Lun Gao,
  • Chang Shu,
  • Jun-Cheng Wang,
  • Li-Feng Yan,
  • Yi-Guang Yang,
  • Feng-Ying Xie,
  • Jie Liu

DOI
https://doi.org/10.3389/fmed.2021.626369
Journal volume & issue
Vol. 8

Abstract

Read online

Background: Numerous studies have attempted to apply artificial intelligence (AI) in the dermatological field, mainly on the classification and segmentation of various dermatoses. However, researches under real clinical settings are scarce.Objectives: This study was aimed to construct a novel framework based on deep learning trained by a dataset that represented the real clinical environment in a tertiary class hospital in China, for better adaptation of the AI application in clinical practice among Asian patients.Methods: Our dataset was composed of 13,603 dermatologist-labeled dermoscopic images, containing 14 categories of diseases, namely lichen planus (LP), rosacea (Rosa), viral warts (VW), acne vulgaris (AV), keloid and hypertrophic scar (KAHS), eczema and dermatitis (EAD), dermatofibroma (DF), seborrheic dermatitis (SD), seborrheic keratosis (SK), melanocytic nevus (MN), hemangioma (Hem), psoriasis (Pso), port wine stain (PWS), and basal cell carcinoma (BCC). In this study, we applied Google's EfficientNet-b4 with pre-trained weights on ImageNet as the backbone of our CNN architecture. The final fully-connected classification layer was replaced with 14 output neurons. We added seven auxiliary classifiers to each of the intermediate layer groups. The modified model was retrained with our dataset and implemented using Pytorch. We constructed saliency maps to visualize our network's attention area of input images for its prediction. To explore the visual characteristics of different clinical classes, we also examined the internal image features learned by the proposed framework using t-SNE (t-distributed Stochastic Neighbor Embedding).Results: Test results showed that the proposed framework achieved a high level of classification performance with an overall accuracy of 0.948, a sensitivity of 0.934 and a specificity of 0.950. We also compared the performance of our algorithm with three most widely used CNN models which showed our model outperformed existing models with the highest area under curve (AUC) of 0.985. We further compared this model with 280 board-certificated dermatologists, and results showed a comparable performance level in an 8-class diagnostic task.Conclusions: The proposed framework retrained by the dataset that represented the real clinical environment in our department could accurately classify most common dermatoses that we encountered during outpatient practice including infectious and inflammatory dermatoses, benign and malignant cutaneous tumors.

Keywords