IEEE Access (Jan 2021)
Automatic Report Generation for Chest X-Ray Images via Adversarial Reinforcement Learning
Abstract
An adversarially reinforced report-generation framework for chest X-ray images is proposed. Previous medical-report-generation models are mostly trained by minimizing cross-entropy loss or by further optimizing common image-captioning metrics, such as CIDEr, while ignoring diagnostic accuracy, which should be the first consideration in this area. Inspired by generative adversarial networks, an adversarial reinforcement learning approach is proposed for chest X-ray report generation that considers both diagnostic accuracy and language fluency. Specifically, an accuracy discriminator (AD) and a fluency discriminator (FD) are built to serve as evaluators that score a report on these two aspects: the FD estimates how likely a report is to originate from a human expert, while the AD measures how well a report covers the key chest observations. Their weighted score is treated as a “reward” for training the report generator via reinforcement learning, which circumvents the problem that gradients cannot be back-propagated to the generative model when its output is discrete. Simultaneously, the two discriminators are optimized by maximum-likelihood estimation to improve their assessment ability. Additionally, a multi-type medical-concept-fused encoder followed by a hierarchical decoder is adopted as the report generator. Experiments on two large radiograph datasets demonstrate that the proposed model outperforms all compared methods.
Keywords