Enhancement of K-means clustering in big data based on equilibrium optimizer algorithm

Al-kababchee Sarah Ghanim Mahmood; Algamal Zakariya Yahya; Qasim Omar Saber

doi:10.1515/jisys-2022-0230

Journal of Intelligent Systems (Feb 2023)

Affiliations

Al-kababchee Sarah Ghanim Mahmood: Department of Mathematics, University of Mosul, 41002 Mosul, Iraq
Algamal Zakariya Yahya: Department of Statistics and Informatics, University of Mosul, 41002 Mosul, Iraq
Qasim Omar Saber: Department of Mathematics, University of Mosul, 41002 Mosul, Iraq

DOI: https://doi.org/10.1515/jisys-2022-0230
Journal volume & issue: Vol. 32, no. 1
pp. 99 – 106

Abstract

Clustering is a primary data mining method with several uses, including gene analysis. It is an unsupervised learning problem in which a set of unlabeled data is divided into clusters based on data features, so that data in a cluster are more similar to one another than to those in other clusters. However, the number of clusters has a direct impact on how well the K-means algorithm performs. Finding the best solutions to such real-world optimization problems requires techniques that properly explore the search space. In this research, an enhancement of K-means clustering is proposed by applying an equilibrium optimization approach. The suggested approach adjusts the number of clusters while simultaneously selecting the best attributes to find the optimal solution. Compared with existing algorithms on five datasets, the suggested method performs better in terms of intra-cluster distances (the distances between elements within the same cluster) and the Rand index. In conclusion, the suggested technique can be successfully employed for data clustering and can offer significant support.
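
The abstract names two evaluation measures, intra-cluster distance and the Rand index, without defining them. The following is a minimal sketch of one common definition of each, not the authors' implementation: it assumes Euclidean distance to the cluster centroid for the first measure and the unadjusted pairwise Rand index for the second. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import combinations

import numpy as np


def intra_cluster_distance(X, labels):
    """Sum of Euclidean distances from each point to its cluster centroid.

    Assumption: the paper may use a different variant (e.g. squared distances).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        centroid = members.mean(axis=0)
        total += np.linalg.norm(members - centroid, axis=1).sum()
    return float(total)


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree.

    A pair agrees when both labelings put the two points in the same
    cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: two well-separated groups of two points each.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
truth = [0, 0, 1, 1]
good = [1, 1, 0, 0]  # same partition as truth, cluster ids swapped
bad = [0, 1, 0, 1]   # mixes the two groups

print(intra_cluster_distance(X, good), rand_index(truth, good))
print(intra_cluster_distance(X, bad), rand_index(truth, bad))
```

A lower intra-cluster distance indicates more compact clusters, and a Rand index closer to 1 indicates closer agreement with the reference labels. Note that the Rand index ignores how cluster ids are named, which is why the `good` labeling scores the same as `truth` itself.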

Published in Journal of Intelligent Systems

ISSN: 0334-1860 (Print); 2191-026X (Online)
Publisher: De Gruyter
Country of publisher: Poland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.degruyter.com/view/journals/jisys/jisys-overview.xml
