A binary biclustering algorithm based on the adjacency difference matrix for gene expression data analysis

He-Ming Chu; Jin-Xing Liu; Ke Zhang; Chun-Hou Zheng; Juan Wang; Xiang-Zhen Kong

doi:10.1186/s12859-022-04842-4

BMC Bioinformatics (Sep 2022)

Affiliations

He-Ming Chu: School of Computer Science, Qufu Normal University
Jin-Xing Liu: School of Computer Science, Qufu Normal University
Ke Zhang: Department of Oncology, Rizhao People’s Hospital
Chun-Hou Zheng: School of Computer Science, Qufu Normal University
Juan Wang: School of Computer Science, Qufu Normal University
Xiang-Zhen Kong: School of Computer Science, Qufu Normal University

DOI: https://doi.org/10.1186/s12859-022-04842-4
Journal volume & issue: Vol. 23, no. 1, pp. 1–16

Abstract

Biclustering algorithms are effective tools for processing gene expression datasets. Biclustering methods process two kinds of data matrices: binary and non-binary. A binary matrix is usually converted from pre-processed gene expression data, which can effectively reduce interference from noise and abnormal data, and is then processed using a biclustering algorithm. However, existing biclustering algorithms for binary data strike a poor balance between running time and performance. In this paper, we propose a new biclustering algorithm for binary data, the Adjacency Difference Matrix Binary Biclustering algorithm (AMBB), to address this drawback. The AMBB algorithm constructs the adjacency matrix based on the adjacency difference values, and the submatrix obtained by continuously updating the adjacency difference matrix is called a bicluster. The adjacency matrix allows genes that undergo similar reactions under different conditions to be grouped into clusters, which is important for subsequent gene analysis. Experiments on synthetic and real datasets visually demonstrate that the AMBB algorithm is highly practical.
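
The conversion of expression data to a binary matrix mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the AMBB algorithm, and the abstract does not specify how the authors binarize their data: the per-gene mean threshold, the function names `binarize_expression` and `is_all_ones_bicluster`, and the toy values are all assumptions made for illustration. The sketch only shows what a binary matrix and a bicluster (a submatrix of genes and conditions) look like.

```python
from statistics import mean


def binarize_expression(matrix):
    """Binarize each gene (row) against its own mean expression.

    Assumption: a per-gene mean threshold. The paper may use a different rule.
    """
    binary = []
    for row in matrix:
        threshold = mean(row)
        binary.append([1 if value > threshold else 0 for value in row])
    return binary


def is_all_ones_bicluster(binary, genes, conditions):
    """Return True if the selected rows and columns form an all-ones submatrix."""
    return all(binary[g][c] == 1 for g in genes for c in conditions)


# Hypothetical toy data: 3 genes (rows) x 4 conditions (columns).
expression = [
    [5.0, 1.0, 6.0, 0.5],
    [4.0, 0.5, 7.0, 1.0],
    [1.0, 3.0, 1.5, 2.5],
]

binary = binarize_expression(expression)
for row in binary:
    print(row)

# Genes 0 and 1 are both "on" under conditions 0 and 2.
print(is_all_ones_bicluster(binary, genes=[0, 1], conditions=[0, 2]))
```

In this toy example, genes 0 and 1 share the same binary pattern across conditions, so together with conditions 0 and 2 they form a small all-ones bicluster, while gene 2 does not belong to it.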

Published in BMC Bioinformatics

ISSN: 1471-2105 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Biology (General)
Website: http://www.biomedcentral.com/bmcbioinformatics/
