Algorithms (Aug 2024)

Optimization of Gene Selection for Cancer Classification in High-Dimensional Data Using an Improved African Vultures Algorithm

  • Mona G. Gafar,
  • Amr A. Abohany,
  • Ahmed E. Elkhouli,
  • Amr A. Abd El-Mageed

DOI
https://doi.org/10.3390/a17080342
Journal volume & issue
Vol. 17, no. 8
p. 342

Abstract

Read online

This study presents a novel method, termed RBAVO-DE (Relief Binary African Vultures Optimization based on Differential Evolution), aimed at addressing the Gene Selection (GS) challenge in high-dimensional RNA-Seq data, specifically the rnaseqv2 lluminaHiSeq rnaseqv2 un edu Level 3 RSEM genes normalized dataset, which contains over 20,000 genes. RNA Sequencing (RNA-Seq) is a transformative approach that enables the comprehensive quantification and characterization of gene expressions, surpassing the capabilities of micro-array technologies by offering a more detailed view of RNA-Seq gene expression data. Quantitative gene expression analysis can be pivotal in identifying genes that differentiate normal from malignant tissues. However, managing these high-dimensional dense matrix data presents significant challenges. The RBAVO-DE algorithm is designed to meticulously select the most informative genes from a dataset comprising more than 20,000 genes and assess their relevance across twenty-two cancer datasets. To determine the effectiveness of the selected genes, this study employs the Support Vector Machine (SVM) and k-Nearest Neighbor (k-NN) classifiers. Compared to binary versions of widely recognized meta-heuristic algorithms, RBAVO-DE demonstrates superior performance. According to Wilcoxon’s rank-sum test, with a 5% significance level, RBAVO-DE achieves up to 100% classification accuracy and reduces the feature size by up to 98% in most of the twenty-two cancer datasets examined. This advancement underscores the potential of RBAVO-DE to enhance the precision of gene selection for cancer research, thereby facilitating more accurate and efficient identification of key genetic markers.

Keywords