Autoencoder-based conditional optimal transport generative adversarial network for medical image generation

Jun Wang; Bohan Lei; Liya Ding; Xiaoyin Xu; Xianfeng Gu; Min Zhang

Visual Informatics (Mar 2024)

Affiliations

Jun Wang: School of Software Technology, Zhejiang University, Ningbo, Zhejiang, China
Bohan Lei: College of Computer Science & Technology, Zhejiang University, Hangzhou, Zhejiang, China
Liya Ding: Department of Pathology, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
Xiaoyin Xu: College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou, Zhejiang, China
Xianfeng Gu: Department of Computer Science, State University of New York at Stony Brook, Stony Brook, NY, USA
Min Zhang: College of Computer Science & Technology, Zhejiang University, Hangzhou, Zhejiang, China; Corresponding author.

Journal volume & issue: Vol. 8, no. 1
pp. 15–25

Abstract

Medical image generation has recently garnered significant interest among researchers. However, mainstream generative models, such as Generative Adversarial Networks (GANs), often encounter difficulties during training, including mode collapse. To address these issues, we propose AE-COT-GAN (Autoencoder-based Conditional Optimal Transport Generative Adversarial Network), a model for generating medical images that belong to specific categories. Training comprises three components. First, we employ an autoencoder to obtain a low-dimensional manifold representation of real images. Second, we apply extended semi-discrete optimal transport to map a Gaussian noise distribution to the latent space distribution and to obtain the corresponding labels. This procedure yields new latent codes with known labels. Finally, we integrate a GAN to further train the decoder to generate medical images. To evaluate AE-COT-GAN, we conducted experiments on two medical image datasets, DermaMNIST and BloodMNIST, and compared the model’s performance with that of state-of-the-art generative models. The results show that AE-COT-GAN performed well in generating medical images and effectively addressed the issues commonly associated with traditional GANs.
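
The second step can be illustrated with a minimal sketch. It assumes the standard semi-discrete optimal transport formulation, in which a noise sample x is sent to the latent code y_i that maximizes <x, y_i> + h_i, and the height vector h is fitted by Monte Carlo gradient descent so that every code receives equal mass. The abstract does not specify the "extended" variant, so the function names (`fit_semi_discrete_ot`, `transport`), the hyperparameters, and the rule of carrying each code's label along with it are assumptions made for illustration, not the authors' implementation. The sketch also returns only existing codes; how the paper produces new latent codes is not described here.

```python
import numpy as np

def fit_semi_discrete_ot(latents, n_iters=300, batch=4096, lr=0.5, seed=0):
    """Estimate heights h so that the map
    x -> latents[argmax_i(<x, latents[i]> + h[i])]
    sends Gaussian noise to each latent code with mass 1/n (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, d = latents.shape
    h = np.zeros(n)
    target = np.full(n, 1.0 / n)
    for _ in range(n_iters):
        x = rng.standard_normal((batch, d))          # Gaussian noise samples
        idx = np.argmax(x @ latents.T + h, axis=1)   # cell each sample falls in
        mass = np.bincount(idx, minlength=n) / batch # empirical cell masses
        h -= lr * (mass - target)                    # shrink cells holding too much mass
        h -= h.mean()                                # remove the free additive constant
    return h

def transport(noise, latents, labels, h):
    """Map noise to latent codes and return each code together with its label."""
    idx = np.argmax(noise @ latents.T + h, axis=1)
    return latents[idx], labels[idx]
```

In a full pipeline, `latents` would be the autoencoder's encodings of the training images and `labels` their class labels, and the transported codes would then be passed to the decoder.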

Published in Visual Informatics

ISSN: 2468-502X (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://www.journals.elsevier.com/visual-informatics/
