Automatika (Jan 2024)
Spatial clustering based gene selection for gene expression analysis in microarray data classification
Abstract
A typical application of categorization in data mining is to uncover interesting distributions and significant patterns in the underlying data, for example using density-based spatial clustering of applications with noise (DBSCAN). Under these conditions, the microarray gene expression data to be classified is expected to have the clustering property needed to highlight the effects of the alterations. The proposed method typically ensures the subsequent identification of the best global arrangement of genes into clusters. It provides an iterative procedure for determining the number of clusters needed for each data set. The technique builds on practices commonly used in statistical testing. The key idea is to coordinate the optimization of gene redistribution across clusters with the search for the optimal number of clusters. The proposed approach was evaluated in an experiment that finds the most effective number of genes over time. Using this stringent statistical test, we show that the technique accurately clusters more than 95% of the genes. Finally, because the basic principles of gene development and gene cluster assignment have been well characterized by earlier studies, the technique was verified using real gene expression data.
Keywords