IEEE Access (Jan 2023)

Attention-Aided Generative Learning for Multi-Scale Multi-Modal Fundus Image Translation

Van-Nguyen Pham, Duc-Tai Le, Junghyun Bum, Eun Jung Lee, Jong Chul Han, Hyunseung Choo

DOI: https://doi.org/10.1109/ACCESS.2023.3278596
Journal volume & issue: Vol. 11, pp. 51701–51711

Abstract

Conventional fundus images (CFIs) and ultra-widefield fundus images (UFIs) are two fundamental image modalities in ophthalmology. While CFIs provide a detailed view of the optic nerve head and the posterior pole of an eye, their clinical use is associated with high costs and patient inconvenience due to the requirement of good pupil dilation. On the other hand, UFIs capture peripheral lesions, but their image quality is sensitive to factors such as pupil size, eye position, and eyelashes, leading to greater variability between examinations compared to CFIs. The widefield retina view of UFIs offers the theoretical possibility of generating CFIs from available UFIs to reduce patient examination costs. A recent study has shown the feasibility of this approach by leveraging deep learning techniques for the UFI-to-CFI translation task. However, the technique suffers from the heterogeneous scales of the image modalities and variations in the brightness of the training data. In this paper, we address these issues with a novel framework consisting of three stages: cropping, enhancement, and translation. The first stage is an optic disc-centered cropping strategy that helps to alleviate the scale difference between the two image domains. The second stage mitigates the variation in training data brightness and unifies the mask between the two modalities. In the last stage, we introduce an attention-aided generative learning model to translate a given UFI into the CFI domain. Our experimental results demonstrate the success of the proposed method on 1,011 UFIs, with 99.8% of the generated CFIs evaluated as good quality and usable. Expert evaluations confirm significant visual quality improvements in the generated CFIs compared to the UFIs, ranging from 10% to 80% for features such as optic nerve structure, vascular distribution, and drusen. Furthermore, using generated CFIs in an AI-based diagnosis system for age-related macular degeneration results in superior accuracy compared to UFIs and competitive performance relative to real CFIs. These results showcase the potential of our approach for automatic disease diagnosis and monitoring.
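The abstract outlines the three-stage pipeline (cropping, enhancement, translation) without implementation detail. The Python sketch below is one plausible reading of those stages, not the authors' actual method: the function names, the crop size, the use of CLAHE for brightness normalization, and the circular masking are all illustrative assumptions, and the trained attention-aided generator is left as an abstract callable.

```python
# Illustrative sketch only: the paper publishes no code here, so all names
# (crop_optic_disc_centered, enhance, translate) and parameter choices
# (1024-pixel crops, CLAHE) are hypothetical stand-ins for its three stages.
import cv2
import numpy as np


def crop_optic_disc_centered(ufi, disc_xy, crop_size=1024):
    """Stage 1: cut a fixed window centered on the optic disc so the UFI's
    field of view roughly matches the CFI scale."""
    x, y = disc_xy
    h = crop_size // 2
    return ufi[max(y - h, 0):y + h, max(x - h, 0):x + h]


def enhance(patch):
    """Stage 2: normalize brightness (CLAHE on the lightness channel is one
    common choice) and apply a circular mask shared by both modalities."""
    lab = cv2.cvtColor(patch, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(l)
    out = cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)
    mask = np.zeros(out.shape[:2], np.uint8)
    center = (out.shape[1] // 2, out.shape[0] // 2)
    cv2.circle(mask, center, min(out.shape[:2]) // 2, 255, -1)
    return cv2.bitwise_and(out, out, mask=mask)


def translate(patch, generator):
    """Stage 3: hand the enhanced crop to a trained attention-aided
    generator (e.g. an attention-gated U-Net inside a GAN) that maps the
    UFI domain to the CFI domain."""
    return generator(patch)
```

In this reading, the first two stages are deterministic preprocessing that shrink the scale and brightness gaps before the learned translation step, matching the abstract's claim that cropping alleviates the scale difference while enhancement mitigates brightness variation and unifies the mask.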
