BMC Bioinformatics (Sep 2022)

A binary biclustering algorithm based on the adjacency difference matrix for gene expression data analysis

  • He-Ming Chu,
  • Jin-Xing Liu,
  • Ke Zhang,
  • Chun-Hou Zheng,
  • Juan Wang,
  • Xiang-Zhen Kong

DOI
https://doi.org/10.1186/s12859-022-04842-4
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 16

Abstract

Biclustering algorithms are effective tools for processing gene expression datasets. Biclustering methods operate on two kinds of data matrices: binary and non-binary. A binary matrix is usually obtained by converting pre-processed gene expression data, which effectively reduces interference from noise and abnormal data, and is then processed with a biclustering algorithm. However, existing biclustering algorithms for binary data strike a poor balance between running time and performance. In this paper, we propose a new biclustering algorithm for binary data, the Adjacency Difference Matrix Binary Biclustering algorithm (AMBB), to address this drawback. The AMBB algorithm constructs an adjacency matrix based on adjacency difference values, and the submatrix obtained by continuously updating the adjacency difference matrix is called a bicluster. The adjacency matrix allows genes that undergo similar reactions under different conditions to be grouped into clusters, which is important for subsequent gene analysis. Experiments on synthetic and real datasets visually demonstrate that the AMBB algorithm is highly practical.
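
To make the binary setting concrete, the following minimal Python sketch shows how an expression matrix might be converted to a binary matrix and what a bicluster looks like in such a matrix. It is not the AMBB algorithm: the abstract does not define the adjacency difference matrix, so none of that is reproduced here. The function names `binarize` and `is_all_ones_bicluster`, the fixed threshold, the toy data, and the "all-ones submatrix" notion of a bicluster are illustrative assumptions.

```python
import numpy as np


def binarize(expr, threshold=None):
    """Return a 0/1 matrix with 1 where expression exceeds the threshold.

    Assumption: a single global cut-off (the matrix mean by default).
    The paper's actual pre-processing may differ.
    """
    expr = np.asarray(expr, dtype=float)
    if threshold is None:
        threshold = expr.mean()
    return (expr > threshold).astype(np.uint8)


def is_all_ones_bicluster(binary, rows, cols):
    """Check whether the submatrix rows x cols contains only ones.

    Assumption: a bicluster is taken to be an all-ones submatrix,
    a common convention for binary data.
    """
    return bool(binary[np.ix_(rows, cols)].all())


# Toy data: rows are genes, columns are conditions.
expr = np.array([
    [5.1, 0.2, 4.8, 0.1],
    [4.9, 0.3, 5.2, 0.4],
    [0.2, 0.1, 0.3, 6.0],
])
binary = binarize(expr, threshold=2.0)

# Genes 0 and 1 are both active under conditions 0 and 2.
found = is_all_ones_bicluster(binary, rows=[0, 1], cols=[0, 2])
```

Here genes 0 and 1 form a candidate bicluster over conditions 0 and 2, because both genes are switched on under both conditions once the data are binarized.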

Keywords