A Multi-Label Detection Deep Learning Model with Attention-Guided Image Enhancement for Retinal Images

Zhenwei Li; Mengying Xu; Xiaoli Yang; Yanqi Han; Jiawen Wang

doi:10.3390/mi14030705

Micromachines (Mar 2023)

Affiliations

Zhenwei Li: College of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang 471032, China
Mengying Xu: College of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang 471032, China
Xiaoli Yang: College of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang 471032, China
Yanqi Han: College of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang 471032, China
Jiawen Wang: College of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang 471032, China

DOI: https://doi.org/10.3390/mi14030705
Journal volume & issue: Vol. 14, no. 3, p. 705

Abstract

Multi-disease fundus image classification still suffers from small datasets, imbalanced class distributions, and low classification accuracy. To reduce the large data demand of deep learning models, an ensemble model for multi-disease fundus image classification based on gradient-weighted class activation mapping (Grad-CAM) is proposed. The model uses VGG19 and ResNet50 as the classification networks. Grad-CAM serves as a data augmentation module that produces an activation map from the output of a network convolutional layer. Both the augmented and the original data are fed to the model for classification. The data augmentation module guides the model to learn the feature differences of lesions in the fundus and improves the robustness of the classification model. Fine-tuning and transfer learning are used to improve the accuracy of the multiple classifiers. The proposed method was evaluated on the RFMiD (Retinal Fundus Multi-Disease Image Dataset) dataset, and an ablation experiment was performed. In comparison with other methods, the model achieves an accuracy, precision, and recall of 97%, 92%, and 81%, respectively. The resulting activation maps show the regions of interest used for classification, making the classification network easier to interpret.
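
As an illustration of the activation map mentioned in the abstract, the following is a minimal sketch of the standard Grad-CAM computation: each feature map of a convolutional layer is weighted by the spatial average of the class-score gradient, the weighted maps are summed, and a ReLU is applied. The function name `grad_cam`, the nested-list data layout, and the toy values are assumptions made for this example. The abstract does not state which layer is used or how the map is combined with the original image, so this should not be read as the authors' implementation.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM map from K feature maps and their gradients.

    activations, gradients: lists of K maps, each an H x W nested list.
    (Hypothetical layout chosen for this sketch.)
    """
    k = len(activations)
    h = len(activations[0])
    w = len(activations[0][0])
    # Channel weights: global average of the class-score gradient per map.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    # Weighted sum of the feature maps, followed by ReLU.
    cam = [
        [max(0.0, sum(weights[c] * activations[c][i][j] for c in range(k)))
         for j in range(w)]
        for i in range(h)
    ]
    # Scale to [0, 1] so the map can be used as an attention mask.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy input: two 2 x 2 feature maps with constant gradients (made-up values).
acts = [[[1, 0], [0, 2]], [[0, 3], [1, 0]]]
grads = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
cam = grad_cam(acts, grads)
print(cam)  # [[0.5, 0.0], [0.0, 1.0]]
```

In practice the activations and gradients would come from a trained network such as VGG19 or ResNet50, and the map would be upsampled to the input resolution before being overlaid on the fundus image.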

Published in Micromachines

ISSN: 2072-666X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Mechanical engineering and machinery
Website: https://www.mdpi.com/journal/micromachines
