IEEE Access (Jan 2021)

A Multi-View Clustering Algorithm for Mixed Numeric and Categorical Data

  • Jinchao Ji,
  • Ruonan Li,
  • Wei Pang,
  • Fei He,
  • Guozhong Feng,
  • Xiaowei Zhao

DOI
https://doi.org/10.1109/ACCESS.2021.3057113
Journal volume & issue
Vol. 9
pp. 24913 – 24924

Abstract

Clustering data with both numeric and categorical attributes is important because such data are ubiquitous in real-world problems. Multi-view learning approaches have proven more effective, and to generalise better, than single-view learning in many problems. However, most existing clustering algorithms for mixed numeric and categorical data are single-view. In this research, we propose a novel multi-view clustering algorithm for mixed data based on k-prototypes, which we term Multi-view K-Prototypes. To the best of our knowledge, it is the first multi-view version of the well-known k-prototypes algorithm. To cluster mixed data over multiple views, we present a novel representation of cluster centres for the multi-view setting and devise formulas for updating the cluster centres in each view. We then propose the concept of consensus cluster centres, which are used to produce the final clustering result. Finally, we carry out a series of experiments on four benchmark datasets to assess the performance of the proposed algorithm. The experimental results show that Multi-view K-Prototypes outperforms seven state-of-the-art algorithms in most cases.
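
For orientation, the classic single-view k-prototypes building blocks that the proposed method extends can be sketched as follows. This is a minimal illustration and not the authors' multi-view algorithm: the abstract does not give the multi-view centre representation, the per-view update formulas, or the consensus step, so none of them appear here. The function names and the weight `gamma` are assumptions chosen for illustration.

```python
from collections import Counter


def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Standard single-view k-prototypes dissimilarity (illustrative).

    Squared Euclidean distance on the numeric attributes plus gamma times
    the number of mismatched categorical attributes.
    """
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical = sum(a != b for a, b in zip(x_cat, proto_cat))
    return numeric + gamma * categorical


def update_prototype(members_num, members_cat):
    """Recompute one prototype: mean per numeric column, mode per categorical column."""
    n = len(members_num)
    proto_num = [sum(col) / n for col in zip(*members_num)]
    proto_cat = [Counter(col).most_common(1)[0][0] for col in zip(*members_cat)]
    return proto_num, proto_cat


def assign(x_num, x_cat, prototypes, gamma=1.0):
    """Return the index of the prototype closest to the object."""
    dists = [
        mixed_dissimilarity(x_num, x_cat, p_num, p_cat, gamma)
        for p_num, p_cat in prototypes
    ]
    return min(range(len(dists)), key=dists.__getitem__)
```

In this single-view form, `gamma` balances the numeric and categorical parts of the distance. A multi-view variant would presumably apply comparable steps within each view before combining them, but the exact formulation is given in the full paper, not in this abstract.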

Keywords