Scientific Reports (Jun 2024)

Data imbalance in cardiac health diagnostics using CECG-GAN

  • Yang Yang,
  • Tianyu Lan,
  • Yang Wang,
  • Fengtian Li,
  • Liyan Liu,
  • Xupeng Huang,
  • Fei Gao,
  • Shuhua Jiang,
  • Zhijun Zhang,
  • Xing Chen

DOI
https://doi.org/10.1038/s41598-024-65619-8
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 16

Abstract

Heart disease is the world’s leading cause of death. Diagnostic models based on electrocardiograms (ECGs) are often limited by the scarcity of high-quality data and by data imbalance. To address these challenges, we propose CECG-GAN, a conditional generative adversarial network that generates samples closely approximating the distribution of real ECG data. By integrating a transformer architecture, CECG-GAN also addresses waveform jitter, slow processing speed, and dataset imbalance. We evaluated the approach on two datasets, MIT-BIH and CSPC2020. The experimental results show that CECG-GAN performs well on all four evaluation metrics. The percentage root mean square difference (PRD) reached 55.048, indicating a high degree of similarity between generated and real ECG waveforms. The Fréchet distance (FD) was approximately 1.139, the root mean square error (RMSE) was 0.232, and the mean absolute error (MAE) was 0.166.
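
The abstract reports four similarity metrics but does not give their formulas. The following is a minimal sketch of how such metrics are commonly computed for a real and a generated waveform of equal length, assuming the standard textbook definitions. The function names are hypothetical, and the Fréchet distance shown is the discrete variant computed on amplitude values only, which may differ from the computation used in the paper.

```python
import numpy as np


def prd(real, generated):
    """Percentage root mean square difference, in percent (assumed standard form)."""
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)
    return 100.0 * np.sqrt(np.sum((real - generated) ** 2) / np.sum(real ** 2))


def rmse(real, generated):
    """Root mean square error between two equal-length signals."""
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)
    return float(np.sqrt(np.mean((real - generated) ** 2)))


def mae(real, generated):
    """Mean absolute error between two equal-length signals."""
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)
    return float(np.mean(np.abs(real - generated)))


def discrete_frechet(real, generated):
    """Discrete Fréchet distance via dynamic programming.

    Assumption: the distance between two samples is the absolute
    difference of their amplitudes (the time axis is ignored).
    """
    p = np.asarray(real, dtype=float)
    q = np.asarray(generated, dtype=float)
    n, m = len(p), len(q)
    d = np.abs(p[:, None] - q[None, :])  # pairwise sample distances
    ca = np.empty((n, m))
    ca[0, 0] = d[0, 0]
    for i in range(1, n):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, m):
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, n):
        for j in range(1, m):
            best_prev = min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1])
            ca[i, j] = max(best_prev, d[i, j])
    return float(ca[-1, -1])


if __name__ == "__main__":
    # Toy signals for illustration only; these are not ECG data.
    real = [1.0, 2.0, 3.0, 4.0]
    fake = [1.0, 2.0, 3.0, 5.0]
    print(prd(real, fake), rmse(real, fake), mae(real, fake),
          discrete_frechet(real, fake))
```

All four quantities are zero for identical signals and grow as the generated waveform departs from the real one, so lower values indicate closer agreement.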

Keywords