IEEE Access (Jan 2020)
A New Cluster Validity Index Based on the Adjustment of Within-Cluster Distance
Abstract
Evaluating clustering results is an important component of cluster analysis and is typically carried out with a cluster validity index. However, the performance of most existing indices depends not only on the specific clustering algorithm but also on how within- and between-cluster distances are measured and on the structure of the data, which limits their applicability in practice. In this paper, a new within-cluster distance is first defined under a general assumption. A novel cluster validity index is then proposed by adjusting the within-cluster distance of each point according to an adjustment rule. Moreover, the notion of a chain is introduced to eliminate the effects of cluster size, density, and shape. The index requires no prior information about the clustering algorithm and is independent of the data structure. Two groups of synthetic datasets with various characteristics, together with real-world datasets, are used to validate the proposed index. Experimental results demonstrate that the index achieves higher evaluation accuracy than typical existing indices and performs well on datasets with irregularly shaped clusters.
Keywords