IEEE Access (Jan 2024)
Cancer Disease Prediction Using Integrated Smart Data Augmentation and Capsule Neural Network
Abstract
Cancer accounts for a considerable share of the illnesses that cause premature death worldwide, and this trend is expected to worsen in the coming years. Timely and precise identification would therefore be of great help to cancer patients. Gene expression datasets are commonly used for disease identification, particularly in cancer therapy. Deep learning (DL) has become a popular technique in healthcare because of the abundance of computational capacity. In this work, gene expression samples are collected for five types of cancer together with healthy samples, but the number of samples is too small to meet the requirements of deep learning. Data augmentation is frequently used to increase the training sample size. The main objective of this research is the diagnosis and classification of different types of cancer. Correlation-centered feature selection and reduction are used to select the most relevant features from the large volume of gene information. The proposed method combines a smart data augmentation process with a Capsule Neural Network (CapsNet) for the accurate prediction and classification of cancer. The augmentation strategy integrates Uniform Distributive Augmentation (UDA) and a Wasserstein Generative Adversarial Network (W-GAN): synthetic samples are generated using a uniform distribution and the Wasserstein distance, and the resulting augmented datasets are used to train CapsNet. The results of the integrated smart data augmentation with CapsNet are then compared with those of other DL methods. The proposed method improves classification accuracy, precision, and recall (>98%) and reduces the error rate.
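As a rough illustration of the first two stages described in the abstract, the sketch below ranks features by their absolute Pearson correlation with the class label and then enlarges the training set by adding uniformly distributed noise to each sample. It is only a minimal stand-in written under stated assumptions: the abstract does not give the exact correlation criterion or the UDA procedure, so the function names (`select_features`, `uniform_augment`), the parameters `k`, `copies`, and `width`, and the toy data are all hypothetical. The W-GAN and CapsNet stages are not shown.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(samples, labels, k):
    """Return the sorted indices of the k features most correlated with the label.

    Assumption: relevance is measured as |Pearson r| between a feature column
    and the numeric class label; the paper's actual criterion may differ.
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in samples]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda item: (-item[0], item[1]))
    return sorted(j for _, j in scores[:k])


def uniform_augment(samples, labels, copies, width, rng):
    """Keep the original samples and append `copies` noisy copies of each.

    Assumption: each synthetic value is the original value plus noise drawn
    from U(-width, width); this is a simple stand-in for UDA, not the
    authors' procedure. Labels are copied unchanged.
    """
    aug_x = [list(row) for row in samples]
    aug_y = list(labels)
    for row, label in zip(samples, labels):
        for _ in range(copies):
            aug_x.append([v + rng.uniform(-width, width) for v in row])
            aug_y.append(label)
    return aug_x, aug_y


# Toy data (hypothetical): 4 samples, 3 "gene" features, binary labels.
X = [[1.0, 5.0, 0.2], [2.0, 5.1, 0.9], [3.0, 4.9, 0.1], [4.0, 5.0, 0.8]]
y = [0, 0, 1, 1]

keep = select_features(X, y, k=2)
X_reduced = [[row[j] for j in keep] for row in X]
X_aug, y_aug = uniform_augment(X_reduced, y, copies=2, width=0.05, rng=random.Random(0))
print(keep, len(X_aug), len(y_aug))
```

In a fuller pipeline, the augmented set would be combined with W-GAN-generated samples before training the classifier. The noise width would need to be tuned per feature scale, which this sketch does not attempt.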
Keywords