ICTACT Journal on Soft Computing (Jul 2022)
DESIGN OF CATEGORICAL DATA CLUSTERING USING MACHINE LEARNING ENSEMBLE
Abstract
Cluster analysis is a crucial tool for discovering and making sense of a dataset's underlying structure. It has been applied with great success in many contexts and fields, and advances over the last decade have drawn the interest of clinical researchers, scientists, and biologists. The main problem with conventional ensemble clustering is that, as the number of dimensions in a dataset grows, its consensus function often fails to generate final clusters. To address this, this study proposes an improved ensemble framework for clustering categorical datasets. The proposed work employs a link-based similarity measure to identify the clusters to which unknown data belong. More specifically, the model incorporates multiple machine learning algorithms in an ensemble to categorize data. Objective performance indicators are used to compare the proposed model with more traditional approaches and to determine how effective it is.
Keywords