IEEE Access (Jan 2020)

Clustering Mixed Numeric and Categorical Data With Cuckoo Search

  • Jinchao Ji,
  • Wei Pang,
  • Zairong Li,
  • Fei He,
  • Guozhong Feng,
  • Xiaowei Zhao

DOI
https://doi.org/10.1109/ACCESS.2020.2973216
Journal volume & issue
Vol. 8
pp. 30988 – 31003

Abstract

Clustering analysis, an important technique in data mining, aims to identify the natural groups, or clusters, of data objects in the attribute space. Data objects in real-world applications are commonly described by both numeric and categorical attributes. Partitional clustering algorithms designed for this type of mixed data are prone to getting trapped in local optima, whereas the cuckoo search approach is efficient at solving global optimization problems. Motivated by this, we propose CCS-K-Prototypes, a novel partitional Clustering algorithm based on Cuckoo Search and K-Prototypes, for clustering mixed numeric and categorical data. To deal with the different types of attributes, we develop a novel representation for candidate solutions and suggest two formulas that let the cuckoo search for potential solutions either around the existing solutions or in the entire attribute space. Finally, the performance of the proposed algorithm is assessed through a series of experiments on five benchmark datasets.
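
To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's method. It assumes the standard K-Prototypes dissimilarity (squared Euclidean distance on numeric attributes plus a weight `gamma` times the number of categorical mismatches) as the objective, and a Lévy-flight step generated with Mantegna's algorithm, as is common in cuckoo search. The candidate representation (a list of `(numeric, categorical)` prototype pairs), the `perturb` helper, and the parameters `step_scale` and `flip_prob` are hypothetical illustrations; the abstract does not state the two search formulas the authors actually propose.

```python
import math
import random


def mixed_distance(x_num, x_cat, c_num, c_cat, gamma=1.0):
    """K-Prototypes-style dissimilarity between an object and a prototype."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    mismatches = sum(a != b for a, b in zip(x_cat, c_cat))
    return numeric + gamma * mismatches


def clustering_cost(data, prototypes, gamma=1.0):
    """Sum of each object's distance to its nearest prototype."""
    return sum(
        min(mixed_distance(x_num, x_cat, c_num, c_cat, gamma)
            for c_num, c_cat in prototypes)
        for x_num, x_cat in data
    )


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step using Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)


def perturb(prototype, domains, rng, step_scale=0.01, flip_prob=0.1):
    """Hypothetical local move: Levy step on numeric values, random
    resampling of categorical values (an assumption, not the paper's rule)."""
    c_num, c_cat = prototype
    new_num = [v + step_scale * levy_step(rng) for v in c_num]
    new_cat = [
        rng.choice(domains[j]) if rng.random() < flip_prob else v
        for j, v in enumerate(c_cat)
    ]
    return new_num, new_cat


if __name__ == "__main__":
    rng = random.Random(0)
    data = [([0.0, 0.0], ["a"]), ([1.0, 1.0], ["b"])]
    nest = [([0.0, 0.0], ["a"]), ([1.0, 1.0], ["a"])]  # one candidate solution
    candidate = [perturb(p, [["a", "b"]], rng) for p in nest]
    # Keep the new candidate only if it lowers the clustering cost.
    if clustering_cost(data, candidate) < clustering_cost(data, nest):
        nest = candidate
    print(clustering_cost(data, nest))
```

In a full cuckoo search, many such nests would be maintained, and a fraction of the worst ones would be abandoned and re-initialized in each generation; this sketch shows a single greedy replacement step only.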

Keywords