IEEE Access (Jan 2023)

Enhanced Approach for Agglomerative Clustering Using Topological Relations

  • Hakam W. Alomari,
  • Amer F. Al-Badarneh,
  • Abdullah Al-Alaj,
  • Samer Y. Khamaiseh

DOI
https://doi.org/10.1109/ACCESS.2023.3252374
Journal volume & issue
Vol. 11
pp. 21945 – 21967

Abstract

Spatial data clustering has long been used to facilitate the knowledge discovery process. Several approaches have been proposed in the literature for detecting and understanding hidden patterns. These approaches take different perspectives and can be roughly grouped into several main categories, including centroid-based, density-based, grid-based, and hierarchy-based clustering. Although spatial clustering is a very mature research area, existing techniques usually depend on user parameters and continue to use distances between objects as their similarity measure. As a result, they generally suffer from performance and scalability issues. To address this problem, we propose ACUTE, an efficient and scalable approach that detects spatial clusters in both synthetic and real-world data. ACUTE ascertains both the intra-cluster compactness (similarities) and inter-cluster connectedness (dissimilarities) of spatial objects by assessing the topological relations of their corresponding spatial points. While conventional methods cluster by pairwise comparison of the distances between objects, our approach leverages topological relations to reduce the number of distances that must be calculated. This in turn minimizes the number of comparisons required, improving both efficiency and scalability. To evaluate its accuracy, we tested ACUTE extensively on 12 synthetic datasets and 5 real-world datasets, including one location-based (network) dataset. Results show that ACUTE performs well compared with state-of-the-art clustering techniques on several evaluation metrics, including precision, recall, and error rates.
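The abstract does not say which topological relations ACUTE uses, so the following is not the authors' algorithm. It is a minimal Python sketch of the general idea only: a neighbor relation can limit which pairs of points need a distance computation. Uniform-grid cell adjacency is used here as an assumed stand-in for the paper's topological relations. The function names (`cluster_all_pairs`, `cluster_grid_neighbors`) and the threshold `eps` are hypothetical, introduced for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import dist, floor


def _find(parent, i):
    # Union-find root lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def _as_partition(parent):
    # Convert union-find state into a set of clusters (frozensets of indices).
    groups = defaultdict(set)
    for i in range(len(parent)):
        groups[_find(parent, i)].add(i)
    return {frozenset(g) for g in groups.values()}


def cluster_all_pairs(points, eps):
    """Baseline: link every pair of points closer than eps, checking all pairs."""
    parent = list(range(len(points)))
    n_dist = 0
    for i, j in combinations(range(len(points)), 2):
        n_dist += 1
        if dist(points[i], points[j]) <= eps:
            parent[_find(parent, i)] = _find(parent, j)
    return _as_partition(parent), n_dist


def cluster_grid_neighbors(points, eps):
    """Same linkage rule, but only pairs in the same or adjacent grid cells
    are compared. Grid adjacency is an ASSUMED stand-in, not ACUTE's relations."""
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(floor(x / eps), floor(y / eps))].append(idx)
    parent = list(range(len(points)))
    n_dist = 0
    for (cx, cy), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                # .get avoids creating empty cells while iterating.
                for j in cells.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j:  # visit each unordered pair once
                            n_dist += 1
                            if dist(points[i], points[j]) <= eps:
                                parent[_find(parent, i)] = _find(parent, j)
    return _as_partition(parent), n_dist
```

Because the cell side equals `eps`, any two points within `eps` of each other fall in the same or adjacent cells, so both functions return the same clusters. The second one skips distance computations for pairs that are too far apart to be linked.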

Keywords