FFA-GPT: an automated pipeline for fundus fluorescein angiography interpretation and question-answer

Xiaolan Chen; Weiyi Zhang; Pusheng Xu; Ziwei Zhao; Yingfeng Zheng; Danli Shi; Mingguang He

doi:10.1038/s41746-024-01101-z

npj Digital Medicine (May 2024)

FFA-GPT: an automated pipeline for fundus fluorescein angiography interpretation and question-answer

Xiaolan Chen,
Weiyi Zhang,
Pusheng Xu,
Ziwei Zhao,
Yingfeng Zheng,
Danli Shi,
Mingguang He

Affiliations

Xiaolan Chen: School of Optometry, The Hong Kong Polytechnic University
Weiyi Zhang: School of Optometry, The Hong Kong Polytechnic University
Pusheng Xu: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases
Ziwei Zhao: School of Optometry, The Hong Kong Polytechnic University
Yingfeng Zheng: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases
Danli Shi: School of Optometry, The Hong Kong Polytechnic University
Mingguang He: School of Optometry, The Hong Kong Polytechnic University

DOI: https://doi.org/10.1038/s41746-024-01101-z
Journal volume & issue: Vol. 7, no. 1
pp. 1 – 9

Abstract

Read online

Abstract Fundus fluorescein angiography (FFA) is a crucial diagnostic tool for chorioretinal diseases, but its interpretation requires significant expertise and time. Prior studies have used Artificial Intelligence (AI)-based systems to assist FFA interpretation, but these systems lack user interaction and comprehensive evaluation by ophthalmologists. Here, we used large language models (LLMs) to develop an automated interpretation pipeline for both report generation and medical question-answering (QA) for FFA images. The pipeline comprises two parts: an image-text alignment module (Bootstrapping Language-Image Pre-training) for report generation and an LLM (Llama 2) for interactive QA. The model was developed using 654,343 FFA images with 9392 reports. It was evaluated both automatically, using language-based and classification-based metrics, and manually by three experienced ophthalmologists. The automatic evaluation of the generated reports demonstrated that the system can generate coherent and comprehensible free-text reports, achieving a BERTScore of 0.70 and F1 scores ranging from 0.64 to 0.82 for detecting top-5 retinal conditions. The manual evaluation revealed acceptable accuracy (68.3%, Kappa 0.746) and completeness (62.3%, Kappa 0.739) of the generated reports. The generated free-form answers were evaluated manually, with the majority meeting the ophthalmologists’ criteria (error-free: 70.7%, complete: 84.0%, harmless: 93.7%, satisfied: 65.3%, Kappa: 0.762–0.834). This study introduces an innovative framework that combines multi-modal transformers and LLMs, enhancing ophthalmic image interpretation, and facilitating interactive communications during medical consultation.

Published in npj Digital Medicine

ISSN: 2398-6352 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://www.nature.com/npjdigitalmed/

About the journal