A two-stage multi-scale attention-based network for weakly supervised cataract fundus image enhancement

Xiaoyong Fang; Yue Wang; Xiangyu Li; Wanshu Fan; Dongsheng Zhou

doi:10.1038/s41598-025-12157-6

Scientific Reports (Jul 2025)

Affiliations

Xiaoyong Fang: Department, School of Safety and Management Engineering, Hunan Institute of Technology
Yue Wang: National and Local Joint Engineering Laboratory of Computer Aided Design, School of Software Engineering
Xiangyu Li: National and Local Joint Engineering Laboratory of Computer Aided Design, School of Software Engineering
Wanshu Fan: National and Local Joint Engineering Laboratory of Computer Aided Design, School of Software Engineering
Dongsheng Zhou: National and Local Joint Engineering Laboratory of Computer Aided Design, School of Software Engineering

DOI: https://doi.org/10.1038/s41598-025-12157-6
Journal volume & issue: Vol. 15, no. 1, pp. 1–12

Abstract

Cataract is a major cause of vision loss and hinders further diagnosis. However, enhancing cataract fundus images remains challenging due to limited paired cataract retinal images and the difficulty of recovering fine details in the retinal images. To mitigate these challenges, we propose a two-stage multi-scale attention-based network (TSMSA-Net) for weakly supervised cataract fundus image enhancement. In Stage 1, we introduce a real-like cataract fundus image synthesis module, which utilizes domain transformation via CycleGAN to generate realistic paired cataract images from unpaired clear and cataract fundus images, thus alleviating the scarcity of paired training data. In Stage 2, we employ a multi-scale attention-based enhancement module, which incorporates hierarchical attention mechanisms to extract rich, fine-grained features from the degraded images under weak supervision, effectively restoring image details and reducing artifacts. Experiments conducted on the Kaggle and ODIR-5K datasets show that TSMSA-Net outperforms existing state-of-the-art methods for cataract fundus image enhancement, even without paired images, and demonstrates strong generalization ability. Moreover, the enhanced images contribute to improved performance in downstream tasks such as vessel segmentation and disease classification.
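The Stage 1 idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only says that CycleGAN-style domain transformation produces paired data from unpaired images. The sketch assumes the standard CycleGAN L1 cycle-consistency term, and it replaces the two learned generators with hypothetical toy functions (`add_haze`, `remove_haze`) based on a simple haze-blending model chosen purely for illustration. The helper `synthesize_pairs` is likewise a hypothetical name.

```python
import numpy as np


def cycle_consistency_loss(x_clear, y_cataract, g_clear_to_cat, f_cat_to_clear):
    """Standard CycleGAN cycle term: mean|F(G(x)) - x| + mean|G(F(y)) - y|."""
    x_rec = f_cat_to_clear(g_clear_to_cat(x_clear))
    y_rec = g_clear_to_cat(f_cat_to_clear(y_cataract))
    return float(np.mean(np.abs(x_rec - x_clear)) + np.mean(np.abs(y_rec - y_cataract)))


# Toy stand-ins for the learned generators (assumption: in a real CycleGAN
# these would be trained neural networks, not closed-form functions).
def add_haze(img, alpha=0.5, haze=0.8):
    """Blend the image with a uniform bright veil to mimic cataract blur."""
    return (1.0 - alpha) * img + alpha * haze


def remove_haze(img, alpha=0.5, haze=0.8):
    """Exact inverse of add_haze for the same alpha and haze values."""
    return (img - alpha * haze) / (1.0 - alpha)


def synthesize_pairs(clear_images, g_clear_to_cat):
    """Build (degraded, clear) training pairs from clear images only."""
    return [(g_clear_to_cat(x), x) for x in clear_images]


rng = np.random.default_rng(0)
clear_images = [rng.random((8, 8, 3)) for _ in range(2)]   # fake clear fundus crops
cataract_image = add_haze(rng.random((8, 8, 3)))            # fake cataract crop

pairs = synthesize_pairs(clear_images, add_haze)
loss = cycle_consistency_loss(clear_images[0], cataract_image, add_haze, remove_haze)
print(len(pairs), loss)
```

Because the toy generators are exact inverses, the cycle loss is close to zero here. With trained networks the loss is only driven toward zero during optimization, and the resulting synthetic pairs would then serve as training input for a Stage 2 enhancement model.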

Published in Scientific Reports

ISSN: 2045-2322 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://www.nature.com/srep/
