IEEE Access (Jan 2024)

Enhancing DBSCAN Clustering for Fingerprint-Based Localization With a Context Similarity Coefficient-Based Similarity Measure Metric

  • Abdulmalik Shehu Yaro,
  • Filip Maly,
  • Karel Maly,
  • Pavel Prazak

DOI
https://doi.org/10.1109/ACCESS.2024.3446674
Journal volume & issue
Vol. 12
pp. 117298 – 117307

Abstract

In fingerprint-based localization systems, clustering the fingerprint database has been proposed as a way to improve localization accuracy while reducing localization time. Among the various clustering algorithms, density-based spatial clustering of applications with noise (DBSCAN) stands out for its robustness to outliers and its ability to accommodate fingerprint databases of various shapes. However, the clustering performance of DBSCAN is heavily influenced by the similarity measure used, and most researchers use distance-based metrics. This paper aims to enhance DBSCAN clustering by replacing distance-based metrics with a pattern-based metric known as the context similarity coefficient (CSC). The CSC metric examines the patterns of the received signal strength (RSS) measurements that form the fingerprint vectors and assesses both linear and non-linear relationships between these vectors to determine their similarity. Four publicly available fingerprint databases were used to evaluate clustering performance, with the silhouette score as the performance metric. The performance of DBSCAN with the CSC metric was compared with its performance using the Euclidean and Manhattan distances. Simulation results indicate that achieving good clustering performance with DBSCAN requires generating three or fewer clusters. The proposed CSC metric achieved the best clustering performance on two of the four fingerprint databases and the second-best on another. However, a computational complexity comparison shows that the CSC metric is highly computationally intensive. It is therefore recommended for small to medium-sized fingerprint databases generated using an odd number of wireless APs deployed in a non-uniform or non-grid-like distribution.
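
The evaluation loop described in the abstract (run DBSCAN with a chosen similarity measure, then score the result with the silhouette score) can be illustrated with a minimal sketch. It is not the authors' code. The abstract does not give the CSC formula, so only the Euclidean and Manhattan baselines are shown; any other measure could be passed in as the metric argument. The toy RSS fingerprints, the eps values, min_pts = 3, and the choice to leave noise points out of the silhouette average are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def dbscan(points, eps, min_pts, metric):
    """Plain DBSCAN with a pluggable metric; returns one label per point, -1 = noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if metric(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1                   # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster          # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])   # core point: keep expanding
    return labels

def silhouette(points, labels, metric):
    """Mean silhouette over clustered points (noise excluded, an assumption here)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for lab, members in clusters.items():
        for i in members:
            if len(members) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to own cluster; b: mean distance to nearest other cluster
            a = sum(metric(points[i], points[j]) for j in members if j != i) / (len(members) - 1)
            b = min(sum(metric(points[i], points[j]) for j in other) / len(other)
                    for olab, other in clusters.items() if olab != lab)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical RSS fingerprints (dBm) from three APs: two groups and one outlier.
fingerprints = [
    [-40, -70, -80], [-42, -69, -81], [-41, -71, -79], [-39, -72, -80],
    [-80, -45, -60], [-79, -46, -62], [-81, -44, -61], [-78, -45, -59],
    [-60, -90, -30],
]

results = {}
for name, metric, eps in [("euclidean", euclidean, 5.0), ("manhattan", manhattan, 8.0)]:
    labels = dbscan(fingerprints, eps, 3, metric)
    results[name] = (labels, silhouette(fingerprints, labels, metric))
    print(name, labels, round(results[name][1], 3))
```

Because the metric is passed in as a function, a pattern-based measure can be swapped in without touching the clustering or scoring code. The neighbor search compares every pair of fingerprints, so its cost grows quadratically with database size, and an expensive metric makes that cost more noticeable.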

Keywords