Journal of King Saud University: Computer and Information Sciences (May 2022)

Ensemble Neighborhood Search (ENS) for biclustering of gene expression microarray data and single cell RNA sequencing data

  • Bhawani Sankar Biswal,
  • Anjali Mohapatra,
  • Swati Vipsita

Journal volume & issue
Vol. 34, no. 5
pp. 2244 – 2251

Abstract

Background: Ensemble biclustering comprises a class of biclustering algorithms that generate one or more consensus partitions of better quality as output. The concept emerged from fusing existing biclustering methods, hybridized on selected aspects, and the design enriches the existing methods with new properties. Biclustering of gene expression microarray data usually involves simultaneous clustering of the expression profiles under specific conditions and determines local two-way clustering models. In general, biclustering solutions depend on parameters such as the number of biclusters and random initialization. Ensemble techniques are proposed to either reduce or eliminate the impact of such parameters on the output biclusters.

Methods: In this paper, the authors propose a novel ensemble biclustering approach, “Ensemble Neighborhood Search (ENS)”, based on the concept of neighborhood search. Simulation results indicate that the proposed approach is more flexible and adaptive than existing competitive methods on high-dimensional gene expression microarray data as well as on scRNA-seq datasets.

Conclusion: The performance of the proposed framework demonstrates its effectiveness relative to other state-of-the-art schemes. The framework is tested on five different microarray datasets and one single cell RNA sequencing (scRNA-seq) dataset. Experimental results reveal that the proposed architecture prevents unusual data loss and delivers output refined as per user standards. The framework also performs effectively on high-sparsity scRNA-seq data, where most algorithms fail because these datasets contain a massive number of zeros. BicAT analysis of the ENS output validates the ENS method as computationally effective and shows that it can be used to improve the quality of the biclusters. Finally, the results are statistically significant, as shown in the ANOVA table. Hence, the ENS method can be considered a reliable framework and may be preferable to traditional biclustering approaches for analyzing gene expression microarray data and high-sparsity scRNA-seq data. The source code of the ENS algorithm can be accessed at https://github.com/c114002/Research/blob/master/ENS_Code.zip.

Keywords