Clustering by Detecting Density Peaks and Assigning Points by Similarity-First Search Based on Weighted K-Nearest Neighbors Graph

Qi Diao; Yaping Dai; Qichao An; Weixing Li; Xiaoxue Feng; Feng Pan

doi:10.1155/2020/1731075

Complexity (Jan 2020)

Affiliations

Qi Diao: Beijing Institute of Technology, School of Automation, Beijing 100081, China
Yaping Dai: Beijing Institute of Technology, School of Automation, Beijing 100081, China
Qichao An: Beijing Institute of Technology, School of Automation, Beijing 100081, China
Weixing Li: Beijing Institute of Technology, School of Automation, Beijing 100081, China
Xiaoxue Feng: Beijing Institute of Technology, School of Automation, Beijing 100081, China
Feng Pan: Beijing Institute of Technology, School of Automation, Beijing 100081, China

DOI: https://doi.org/10.1155/2020/1731075
Journal volume & issue: Vol. 2020

Abstract

This paper presents an improved clustering algorithm for categorizing data with arbitrary shapes. Most conventional clustering approaches work only with round-shaped clusters. Clustering by fast search and find of density peaks (DPC) can handle arbitrarily shaped clusters, but in some cases it is limited by how it defines density peaks and by its allocation strategy. To overcome these limitations, two improvements are proposed in this paper. First, to describe cluster centers more comprehensively, the definitions of local density and relative distance are fused with multiple distances, including K-nearest neighbors (KNN) and shared-nearest neighbors (SNN). Second, a similarity-first search algorithm is designed to find the best-matching cluster center for each noncenter point in a weighted KNN graph. The proposed algorithm is compared extensively with several existing clustering methods, including the traditional DPC algorithm, density-based spatial clustering of applications with noise (DBSCAN), affinity propagation (AP), FKNN-DPC, and K-means. Experiments on synthetic and real data show that the proposed clustering algorithm can outperform DPC, DBSCAN, AP, and K-means in terms of clustering accuracy (ACC), adjusted mutual information (AMI), and adjusted Rand index (ARI).
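
For orientation, the two quantities the abstract builds on, local density and relative distance, can be sketched in a few lines of Python. This is only a minimal illustration of a density-peaks baseline, not the method proposed in the paper: the KNN-based density formula, the function names `dpc_quantities` and `dpc_cluster`, and the parameter `k` are all assumptions made for the example, and neither the SNN fusion nor the similarity-first search on the weighted KNN graph is reproduced here.

```python
import numpy as np


def dpc_quantities(X, k=3):
    """Return (rho, delta, nearest_higher) for the rows of X.

    rho: local density, here exp(-mean distance to the k nearest
         neighbors). This KNN form is an assumption for illustration;
         the original DPC uses a cutoff distance instead.
    delta: relative distance, i.e. distance to the nearest point of
           higher density.
    nearest_higher: index of that point (-1 for the global density peak).
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Column 0 of each sorted row is the point itself (distance 0).
    knn_dist = np.sort(D, axis=1)[:, 1:k + 1]
    rho = np.exp(-knn_dist.mean(axis=1))

    order = np.argsort(-rho, kind="stable")  # densest first
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # Convention: the densest point gets its largest distance.
            delta[i] = D[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(D[i, higher])]
        delta[i] = D[i, j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher


def dpc_cluster(X, n_clusters, k=3):
    """Baseline assignment: each noncenter point inherits the label of
    its nearest higher-density point (not the paper's strategy)."""
    rho, delta, nearest_higher = dpc_quantities(X, k)
    order = np.argsort(-rho, kind="stable")
    gamma = rho * delta
    # The global density peak has no denser neighbor, so it is always a center.
    gamma[order[0]] = np.inf
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(len(X), -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:  # denser points are labeled before sparser ones
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels
```

Calling `dpc_cluster(X, n_clusters=2)` on two well-separated groups of points should give each group a single label. Because every noncenter point simply follows its nearest denser neighbor, one wrong link can propagate down a chain of points, which is the kind of allocation weakness the abstract refers to.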

Published in Complexity

ISSN: 1076-2787 (Print); 1099-0526 (Online)
Publisher: Hindawi-Wiley
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.hindawi.com/journals/complexity/
