A Criterion for Deciding the Number of Clusters in a Dataset Based on Data Depth

Ishwar Baidari; Channamma Patil

doi:10.1142/S2196888820500232

Vietnam Journal of Computer Science (Nov 2020)

Affiliations

Ishwar Baidari: Department of Computer Science, Karnatak University, Dharwad, Karnataka 580003, India
Channamma Patil: Department of Computer Science, Karnatak University, Dharwad, Karnataka 580003, India

DOI: https://doi.org/10.1142/S2196888820500232
Journal volume & issue: Vol. 7, no. 4, pp. 417–431

Abstract

Clustering is a key method in unsupervised learning, with applications in data mining, pattern recognition and intelligent information processing. However, the number of groups to be formed, usually denoted k, is a vital parameter for most existing clustering algorithms, as their results depend heavily on it. Finding the optimal value of k is very challenging. This paper proposes a novel idea for finding the correct number of groups in a dataset based on data depth. The idea is to avoid the traditional process of running the clustering algorithm over a dataset n times and, further, to find the value of k without setting any specific search range for the parameter. We experiment with different indices, namely CH, KL, Silhouette, Gap and CSP, together with the proposed method, on different real and synthetic datasets to estimate the correct number of groups in each dataset. The experimental results indicate good performance of the proposed method.
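
For context, the traditional process that the abstract contrasts with can be sketched roughly as follows: run a clustering algorithm once for every candidate k in a fixed search range, score each result with a validity index, and keep the best-scoring k. This sketch is not the authors' depth-based method, which the abstract does not specify. The function names (`kmeans`, `silhouette`), the deterministic farthest-point initialisation, the toy data and the search range 2 to 5 are all illustrative assumptions, and the Silhouette index is used only because it is one of the indices the abstract names.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd-style k-means with a deterministic farthest-point start
    (an assumption made here for reproducibility, not taken from the paper)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0 by convention."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

# Toy data (hypothetical): three well-separated unit squares in the plane.
points = [(x + dx, y + dy)
          for (x, y) in [(0, 0), (10, 10), (0, 10)]
          for dx in (0, 1) for dy in (0, 1)]

# The traditional loop: one clustering run per candidate k in a preset range.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

The cost of this loop grows with the width of the search range, and the range itself has to be chosen in advance; these are the two burdens the abstract says the proposed method aims to remove.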

Published in Vietnam Journal of Computer Science

ISSN: 2196-8888 (Print); 2196-8896 (Online)
Publisher: World Scientific Publishing
Country of publisher: Singapore
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.worldscientific.com/worldscinet/vjcs
