Journal of Big Data (Feb 2021)

Optimized hybrid investigative based dimensionality reduction methods for malaria vector using KNN classifier

  • Micheal Olaolu Arowolo,
  • Marion Olubunmi Adebiyi,
  • Ayodele Ariyo Adebiyi,
  • Oludayo Olugbara

DOI
https://doi.org/10.1186/s40537-021-00415-z
Journal volume & issue
Vol. 8, no. 1
pp. 1 – 14

Abstract

RNA-Seq data are used in biological applications and in decision making for the classification of genes. Much recent work has focused on reducing the dimensionality of RNA-Seq data, and several dimensionality reduction approaches have been proposed for transforming these data. In this study, a novel optimized hybrid investigative approach is proposed. It combines an optimized genetic algorithm with Principal Component Analysis and Independent Component Analysis (GA-O-PCA and GAO-ICA), which are used to identify an optimum subset and latent correlated features, respectively. A k-nearest neighbors (KNN) classifier is applied to the reduced mosquito Anopheles gambiae dataset to enhance accuracy and scalability in gene expression analysis. The proposed algorithm fetches relevant features from the high-dimensional input feature space. A fast algorithm for feature ranking is used to select relevant features. The performance of the model is evaluated and validated using classification accuracy, which is compared against existing approaches in the literature. The experimental results are promising for selecting relevant genes and classifying gene expression data, indicating that the approach can complement prevailing machine learning methods.
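As a rough illustration of the genetic-algorithm and KNN stages described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a wrapper-style genetic algorithm over binary feature masks, scored by leave-one-out KNN accuracy on a synthetic toy dataset. The PCA and ICA stages are omitted, and all function names, parameter values, and the small subset-size penalty are hypothetical choices made for this example only.

```python
import random


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def loo_accuracy(X, y, mask, k=3):
    """Leave-one-out KNN accuracy using only the features whose mask bit is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    Xs = [[row[j] for j in cols] for row in X]
    hits = 0
    for i in range(len(Xs)):
        rest_X = Xs[:i] + Xs[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, Xs[i], k) == y[i]
    return hits / len(Xs)


def ga_select(X, y, generations=10, pop_size=20, mutation_rate=0.05, seed=0):
    """Evolve binary feature masks; fitness is accuracy minus a small size penalty."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(mask):
        # Hypothetical penalty (assumption): slightly favours smaller subsets.
        return loo_accuracy(X, y, mask) - 0.01 * sum(mask) / n

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep the fitter half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


def make_toy_data(n_samples=40, n_features=12, seed=1):
    """Synthetic stand-in for expression data: only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 4 * label
        row[1] -= 4 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    best = ga_select(X, y)
    print("selected features:", [j for j, bit in enumerate(best) if bit])
    print("leave-one-out accuracy:", loo_accuracy(X, y, best))
```

In a fuller pipeline, the selected columns would be passed to a PCA or ICA transform before classification. That step is left out here to keep the sketch dependency-free.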

Keywords