Jisuanji kexue yu tansuo (Jul 2021)

KNN Algorithm of Enhanced Clustering Based on Density Canopy and Deep Feature

  • SHEN Xueli, QIN Xinyu

DOI
https://doi.org/10.3778/j.issn.1673-9418.2004074
Journal volume & issue
Vol. 15, no. 7
pp. 1289 – 1301

Abstract

Although the K-nearest neighbor (KNN) algorithm is among the most widely used supervised classification algorithms, it is often inefficient on large-scale, high-dimensional data. An improved KNN algorithm for processing such data is therefore proposed. First, a deep neural network (DNN) is used as a feature extractor and for dimensionality reduction, learning the most suitable deep feature representation. Next, the density Canopy algorithm determines an appropriate number of clusters and the initial cluster centers, which become the input parameters of the subsequent K-means clustering. Finally, the learned features are clustered, a hashing strategy from approximate similarity search (ASS) is used to partition the clusters according to their approximate similarity, and the result is taken as the new training set for the KNN classifier. Because the nearest-neighbor samples being searched for may fall in different clusters, which degrades KNN search performance, an additional clustering enhancement strategy is applied during clustering and effectively alleviates this problem. Comparison experiments are run on five different data sets. The results show that, compared with the other algorithms tested, the proposed algorithm greatly improves the accuracy of KNN classification, improves classification efficiency, reduces the distance required for searching, and is robust to noisy data.
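
The cluster-then-search idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the DNN feature extractor, the density Canopy initialization, and the hashing step are all omitted, and the initial centers are simply supplied by hand. The function names (`kmeans`, `build_index`, `knn_predict`) and the `n_probe` parameter are hypothetical; `n_probe` is only a simple stand-in for handling neighbors that fall in different clusters, not the paper's clustering enhancement strategy.

```python
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_center(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda i: dist(p, centers[i]))


def kmeans(points, init_centers, iters=10):
    """Plain Lloyd's K-means. The initial centers are given by the caller
    (assumption: in the paper they would come from density Canopy)."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        # Recompute each center as the mean of its group; keep empty ones.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def build_index(points, labels, centers):
    """Bucket each labelled training point under its nearest center."""
    buckets = [[] for _ in centers]
    for p, y in zip(points, labels):
        buckets[nearest_center(p, centers)].append((p, y))
    return buckets


def knn_predict(query, centers, buckets, k=3, n_probe=1):
    """Majority vote among the k nearest points, searching only the
    n_probe clusters whose centers are closest to the query."""
    order = sorted(range(len(centers)), key=lambda i: dist(query, centers[i]))
    candidates = [item for i in order[:n_probe] for item in buckets[i]]
    nearest = sorted(candidates, key=lambda item: dist(query, item[0]))[:k]
    if not nearest:
        return None  # the probed clusters were empty
    return Counter(y for _, y in nearest).most_common(1)[0][0]


# Toy data: two well-separated groups with hand-picked initial centers.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["a", "a", "a", "b", "b", "b"]
centers = kmeans(points, [(0, 0), (10, 10)])
buckets = build_index(points, labels, centers)
prediction = knn_predict((0.5, 0.5), centers, buckets, k=3)
```

With `n_probe=1` the query is compared only against the points in its closest cluster, which is where the reduction in search cost comes from. Raising `n_probe` trades some of that saving for a lower chance of missing true neighbors that lie just across a cluster boundary.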

Keywords