Journal of King Saud University: Computer and Information Sciences (Apr 2015)

ModEx and Seed-Detective: Two novel techniques for high quality clustering by using good initial seeds in K-Means

  • Md Anisur Rahman,
  • Md Zahidul Islam,
  • Terry Bossomaier

DOI
https://doi.org/10.1016/j.jksuci.2014.04.002
Journal volume & issue
Vol. 27, no. 2
pp. 113 – 128

Abstract


In this paper we present two clustering techniques, ModEx and Seed-Detective. ModEx is a modified version of an existing clustering technique called Ex-Detective and addresses some of its limitations. Seed-Detective combines ModEx with Simple K-Means: it uses ModEx to produce a set of high-quality initial seeds, which are then given as input to K-Means to produce the final clusters. High-quality initial seeds are expected to yield high-quality clusters through K-Means. We compare the performance of Seed-Detective and ModEx with that of Ex-Detective, PAM, Simple K-Means (SK), Basic Farthest Point Heuristic (BFPH) and New Farthest Point Heuristic (NFPH). The evaluation uses three cluster evaluation criteria, namely F-measure, Entropy and Purity, on four natural datasets obtained from the UCI Machine Learning Repository. On these datasets our proposed techniques perform better than the existing techniques in terms of F-measure, Entropy and Purity. Sign test results suggest that the superiority of Seed-Detective (and ModEx) over the existing techniques is statistically significant.
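
The two-stage idea described in the abstract, where seeds come from one procedure and K-Means refines them, can be illustrated with a minimal sketch. This is not the authors' implementation: ModEx itself is not specified in the abstract, so the seeds are simply passed in by the caller. The function names kmeans_from_seeds and purity are hypothetical, and the purity function follows the commonly used definition (the fraction of records that fall in the majority class of their cluster), which may differ in detail from the paper's.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two numeric records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_from_seeds(records, seeds, max_iter=100):
    """Plain K-Means (Lloyd's iterations) started from caller-supplied seeds.

    Assumption: the seeds stand in for the output of a seed-selection step
    such as ModEx, which is NOT implemented here.
    """
    centers = [[float(v) for v in s] for s in seeds]
    labels = []
    for _ in range(max_iter):
        # Assign every record to its nearest current center.
        labels = [
            min(range(len(centers)), key=lambda k: sq_dist(r, centers[k]))
            for r in records
        ]
        # Move each center to the mean of its records; an empty cluster
        # keeps its previous center.
        new_centers = []
        for k, old in enumerate(centers):
            members = [r for r, lab in zip(records, labels) if lab == k]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(old)
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return labels, centers


def purity(labels, classes):
    """Fraction of records belonging to the majority class of their cluster."""
    by_cluster = {}
    for lab, cls in zip(labels, classes):
        by_cluster.setdefault(lab, Counter())[cls] += 1
    majority = sum(max(c.values()) for c in by_cluster.values())
    return majority / len(labels)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    data = [[0, 0], [0, 1], [10, 10], [10, 11]]
    labels, centers = kmeans_from_seeds(data, seeds=[[0, 0], [10, 10]])
    print(labels, centers, purity(labels, ["a", "a", "b", "b"]))
```

Because K-Means only refines the centers it is given, the quality of the final clusters depends on where the seeds start, which is the motivation the abstract gives for producing the seeds with a dedicated technique.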

Keywords