Scientific Reports (Jul 2025)

A two-stage multi-scale attention-based network for weakly supervised cataract fundus image enhancement

  • Xiaoyong Fang,
  • Yue Wang,
  • Xiangyu Li,
  • Wanshu Fan,
  • Dongsheng Zhou

DOI
https://doi.org/10.1038/s41598-025-12157-6
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 12

Abstract

Cataract is a major cause of vision loss and hinders further diagnosis. Enhancing cataract fundus images, however, remains challenging because paired cataract retinal images are scarce and fine retinal details are difficult to recover. To mitigate these challenges, we propose a two-stage multi-scale attention-based network (TSMSA-Net) for weakly supervised cataract fundus image enhancement. In Stage 1, we introduce a real-like cataract fundus image synthesis module, which uses CycleGAN-based domain transformation to generate realistic paired cataract images from unpaired clear and cataract fundus images, thus alleviating the scarcity of paired training data. In Stage 2, we employ a multi-scale attention-based enhancement module, which incorporates hierarchical attention mechanisms to extract rich, fine-grained features from the degraded images under weak supervision, effectively restoring image details and reducing artifacts. Experiments on the Kaggle and ODIR-5K datasets show that TSMSA-Net outperforms existing state-of-the-art methods for cataract fundus image enhancement, even without paired images, and demonstrates strong generalization ability. Moreover, the enhanced images improve performance in downstream tasks such as vessel segmentation and disease classification.

Keywords