Enhancing Small Medical Dataset Classification Performance Using GAN
Mohammad Alauthman, Ahmad Al-qerem, Bilal Sowan, Ayoub Alsarhan, Mohammed Eshtay, Amjad Aldweesh, Nauman Aslam
Affiliations
Mohammad Alauthman
Department of Information Security, Faculty of Information Technology, University of Petra, Amman 11196, Jordan
Ahmad Al-qerem
Computer Science Department, Faculty of Information Technology, Zarqa University, Zarqa 13110, Jordan
Bilal Sowan
Department of Business Intelligence and Data Analytics, University of Petra, Amman 11196, Jordan
Ayoub Alsarhan
Department of Information Technology, Faculty of Prince Al-Hussein Bin Abdallah II for Information Technology, The Hashemite University, Zarqa 13133, Jordan
Mohammed Eshtay
Abdul Aziz Al Ghurair School of Advanced Computing (ASAC), Luminus Technical University, Amman 11118, Jordan
Amjad Aldweesh
College of Computing and Information Technology, Shaqra University, Riyadh 11911, Saudi Arabia
Nauman Aslam
Department of Computer Science and Digital Technologies, Faculty of Engineering and Environment, Northumbria University, Newcastle upon Tyne NE1 8ST, UK
Developing an effective classification model in the medical field is challenging because the available datasets are often small. To address this issue, this study proposes using a generative adversarial network (GAN) as a data-augmentation technique. The aim is to improve the classifier’s generalization performance, stability, and precision by generating synthetic data that closely resemble real data. We employed feature selection and applied five classification algorithms to thirteen benchmark medical datasets, each augmented using the least-squares GAN (LS-GAN). Evaluation of the generated samples at different ratios of augmented data showed that the support vector machine model outperforms the other methods when larger numbers of samples are used. The proposed GAN-based data-augmentation approach is a promising way to improve the performance of classification models in healthcare.
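As a rough illustration of the two ideas named in the abstract, the sketch below shows the least-squares objectives commonly associated with LS-GAN and a helper that mixes synthetic rows into a real dataset at a chosen ratio. It is a minimal sketch under stated assumptions, not the authors' implementation: the target labels (1 for real, 0 for fake) follow the usual least-squares GAN formulation rather than anything stated here, and the function names, the `ratio` parameter, and the `seed` default are hypothetical.

```python
import random


def lsgan_discriminator_loss(d_real, d_fake, real_label=1.0, fake_label=0.0):
    """Least-squares discriminator loss over lists of discriminator scores.

    Assumption: real samples are pushed toward real_label and generated
    samples toward fake_label, as in the usual least-squares GAN setup.
    """
    real_term = sum((d - real_label) ** 2 for d in d_real) / len(d_real)
    fake_term = sum((d - fake_label) ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(d_fake, target_label=1.0):
    """Least-squares generator loss: generated samples should score near target_label."""
    return 0.5 * sum((d - target_label) ** 2 for d in d_fake) / len(d_fake)


def augment(real_rows, synthetic_rows, ratio, seed=0):
    """Return the real rows plus round(ratio * len(real_rows)) synthetic rows.

    Synthetic rows are drawn without replacement; `ratio` is a hypothetical
    knob standing in for the "different ratios of augmented data".
    """
    n_extra = round(ratio * len(real_rows))
    if n_extra > len(synthetic_rows):
        raise ValueError("not enough synthetic rows for the requested ratio")
    rng = random.Random(seed)
    return list(real_rows) + rng.sample(list(synthetic_rows), n_extra)
```

In an experiment of the kind described, a helper like `augment` would be applied to the training split only, so that each classifier is still evaluated on real held-out data.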