A Nature-Inspired Partial Distance-Based Clustering Algorithm

Mohammed El Habib Kahla; Mounir Beggas; Abdelkader Laouid; Mohammad Hammoudeh

doi:10.3390/jsan13040036

Journal of Sensor and Actuator Networks (Jun 2024)

Affiliations

Mohammed El Habib Kahla: LIAP Laboratory, University of El Oued, P.O. Box 789, El Oued 39000, Algeria
Mounir Beggas: LIAP Laboratory, University of El Oued, P.O. Box 789, El Oued 39000, Algeria
Abdelkader Laouid: LIAP Laboratory, University of El Oued, P.O. Box 789, El Oued 39000, Algeria
Mohammad Hammoudeh: Information & Computer Science Department, King Fahd University of Petroleum & Minerals, Academic Belt Road, Dhahran 31261, Saudi Arabia

Journal volume & issue: Vol. 13, no. 4, p. 36

Abstract

In the rapidly advancing landscape of digital technologies, clustering plays a critical role in artificial intelligence and big data, where it is essential for extracting meaningful insights and patterns from large, intricate datasets. Although traditional clustering techniques handle diverse data types and sizes effectively, they are challenged by the increasing volume and dimensionality of data and by the complex structures inherent in high-dimensional spaces. This research recognizes the constraints of conventional clustering methods, including sensitivity to initial centroids, dependence on prior knowledge of the number of clusters, and scalability issues, particularly in large datasets and Internet of Things implementations. In response to these challenges, we propose K-level, a clustering algorithm inspired by the collective behavior of fish locomotion. K-level introduces a novel clustering approach based on greedy, distance-driven merging performed in stages. This iterative process efficiently establishes hierarchical structures without the need for exhaustive computations. K-level gives users enhanced control over computational complexity by letting them specify the number of clusters merged simultaneously. This flexibility ensures accurate and efficient hierarchical clustering across diverse data types, offering a scalable solution for processing extensive datasets within a reasonable timeframe. Three internal validation metrics, the Silhouette Score, the Davies–Bouldin Index, and the Calinski–Harabasz Index, are used to evaluate the K-level algorithm across various types of datasets. Additionally, K-level is compared with established methods from the literature, including UPGMA, CLINK, UPGMC, SLINK, and K-means. The experiments and analyses show that the proposed algorithm overcomes many of the limitations of existing clustering methods, providing scalable and adaptable clustering in the dynamic landscape of evolving data challenges.
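
The abstract names the Silhouette Score as one of the internal validation metrics. As a rough illustration of what that metric measures, the sketch below computes the mean silhouette coefficient from its standard definition, assuming Euclidean distance. The function name `silhouette_score` and the calling convention are illustrative assumptions, not code from the paper, and the paper's own evaluation setup may differ.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (illustrative sketch).

    points: sequence of equal-length numeric tuples
    labels: cluster label for each point, same order as points
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:  # skip degenerate case where all distances are zero
            total += (b - a) / denom
    return total / len(points)
```

Values close to 1 indicate compact, well-separated clusters, while values near or below 0 indicate overlapping or misassigned points, which is why the score is suited to comparing the outputs of different clustering algorithms on the same data.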

Published in Journal of Sensor and Actuator Networks

ISSN: 2224-2708 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology
Website: https://www.mdpi.com/journal/jsan
