Data Science and Engineering (Jun 2019)

Estimating the Optimal Number of Clusters k in a Dataset Using Data Depth

  • Channamma Patil
  • Ishwar Baidari

DOI
https://doi.org/10.1007/s41019-019-0091-y
Journal volume & issue
Vol. 4, no. 2
pp. 132 – 140

Abstract

This paper proposes a new method, called depth difference (DeD), for estimating the optimal number of clusters (k) in a dataset based on data depth. The DeD method estimates k before the actual clustering is performed. We define the depth within clusters, the depth between clusters, and the depth difference to determine the optimal value of k, which is an input parameter for the clustering algorithm. An experimental comparison with leading state-of-the-art alternatives demonstrates that the proposed DeD method outperforms them.

Keywords