Deep learning assisted cancer disease prediction from gene expression data using WT-GAN

U. Ravindran; C. Gunavathi

doi:10.1186/s12911-024-02712-y

BMC Medical Informatics and Decision Making (Oct 2024)

Affiliations

U. Ravindran: School of Computer Science Engineering and Information Systems, Vellore Institute of Technology
C. Gunavathi: School of Computer Science and Engineering, Vellore Institute of Technology

DOI: https://doi.org/10.1186/s12911-024-02712-y
Journal volume & issue: Vol. 24, no. 1
pp. 1–17

Abstract

Deep learning (DL), a subset of artificial intelligence (AI) and machine learning (ML), has greatly benefited diverse fields, including healthcare and drug development. Cancer accounts for a significant share of the illnesses that cause premature mortality worldwide, and this share is likely to rise in the coming years, especially when non-communicable illnesses are not considered. Cancer patients would therefore benefit greatly from precise and timely diagnosis and prediction. DL has become a common technique in healthcare owing to the abundance of computational power. Gene expression datasets are frequently used in major DL-based applications for disease detection, notably in cancer therapy. The quantity of medical data, however, is often insufficient to meet the requirements of deep learning. Microarray gene expression datasets are used for training despite their extreme dimensionality, small number of samples, and sparse information. Data augmentation is commonly used to expand the training sample size for gene data. In the proposed work, the Wasserstein Tabular Generative Adversarial Network (WT-GAN) model generates synthetic data for augmentation. A correlation-based feature selection technique selects the most relevant features based on threshold values. A deep FNN and ML algorithms are then trained to classify the gene expression samples. With WT-GAN augmentation, the augmented data give better classification results (> 97%) for cancer diagnosis.
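
The abstract mentions correlation-based feature selection with threshold values but does not give the exact criterion. As an illustration only, the following minimal sketch assumes that a gene is kept when the absolute Pearson correlation between its expression values and the class label reaches a threshold. The function names, the threshold of 0.5, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(samples, labels, threshold=0.5):
    """Return indices of columns whose |correlation with the label| >= threshold.

    The criterion and the default threshold are assumptions for illustration.
    """
    n_features = len(samples[0])
    selected = []
    for j in range(n_features):
        column = [row[j] for row in samples]
        if abs(pearson(column, labels)) >= threshold:
            selected.append(j)
    return selected


# Toy expression matrix: 6 samples x 3 genes (made-up values).
samples = [
    [1.0, 5.0, 2.0],
    [1.2, 5.0, 9.0],
    [0.9, 5.0, 4.0],
    [3.1, 5.0, 3.0],
    [2.9, 5.0, 8.0],
    [3.3, 5.0, 5.0],
]
labels = [0, 0, 0, 1, 1, 1]

selected = select_features(samples, labels, threshold=0.5)
print(selected)
```

In this toy data only the first gene separates the two classes, so it is the only column retained; the constant second gene and the noisy third gene fall below the threshold. In a pipeline like the one the abstract outlines, such a filter would presumably run before the classifier is trained, on real and synthetic samples alike.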

Published in BMC Medical Informatics and Decision Making

ISSN: 1472-6947 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: http://bmcmedinformdecismak.biomedcentral.com
