Measurement: Sensors (Feb 2023)
Performance analysis of Kmeans with modified initial centroid selection algorithms and developed Kmeans9+ model
Abstract
Partitioning the sample space according to user requirements is a simple and efficient clustering method in many applications. Kmeans is one of the simplest and most efficient partition algorithms and is used in many clustering solutions. We suggest the Multivariant Silhouette method to predict the cluster count for the Kmeans algorithm. The work proposes three new algorithms for initial seed selection. The Kmeans algorithm is modified to use the statistical measures mean, median, and partition centre for the cluster centroid calculation. In the conventional Kmeans algorithm, each sample is compared with all the partition centroids to decide which cluster it belongs to. In the Kmeans9+ algorithm, each sample is compared only with the centroids of its current partition and the eight nearest neighbouring cluster partitions. This reduces needless comparisons of samples with cluster centroids and improves the efficiency of the algorithm. The performance of the developed algorithms is evaluated using data from UCI and KBPE SSLC. Cluster efficiency is analysed using cluster evaluation indices such as the Silhouette and Dunn indices. The nine-nearest-neighbour uniform partition cluster model, Kmeans9+, improves the performance of the Kmeans algorithm and obtains natural cluster results with a minimum number of iterations.
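The restricted comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how neighbouring partitions are identified, so the sketch assumes two-dimensional samples, a uniform square grid of partitions indexed by (row, column), and one centroid per non-empty cell. The names `neighbour_cells`, `assign_restricted`, and `grid` are hypothetical and chosen only for illustration.

```python
import math


def neighbour_cells(cell, grid):
    """Return the cell itself plus its adjacent cells (at most nine in total)
    on an assumed grid x grid layout of partitions."""
    r, c = cell
    return [
        (r + dr, c + dc)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if 0 <= r + dr < grid and 0 <= c + dc < grid
    ]


def assign_restricted(samples, labels, centroids, grid):
    """One assignment pass in which each sample is compared only with the
    centroids of its current cell and the adjacent cells.

    samples   : list of (x, y) points
    labels    : current cell (row, col) of each sample
    centroids : dict mapping cell -> centroid point (assumed structure)
    Returns the new labels and the number of distance computations.
    """
    new_labels = []
    comparisons = 0
    for point, cell in zip(samples, labels):
        best_cell, best_dist = cell, math.inf
        for cand in neighbour_cells(cell, grid):
            if cand not in centroids:  # empty partition: nothing to compare
                continue
            comparisons += 1
            d = math.dist(point, centroids[cand])
            if d < best_dist:
                best_cell, best_dist = cand, d
        new_labels.append(best_cell)
    return new_labels, comparisons


# Toy data (made up for illustration): a 4 x 4 grid with one centroid
# at the centre of every cell, and two samples with their current cells.
grid = 4
centroids = {(r, c): (r + 0.5, c + 0.5) for r in range(grid) for c in range(grid)}
samples = [(0.9, 0.4), (2.1, 2.6)]
labels = [(1, 0), (2, 2)]

new_labels, comparisons = assign_restricted(samples, labels, centroids, grid)
full_comparisons = len(samples) * len(centroids)  # cost of the conventional pass
print(new_labels, comparisons, full_comparisons)
```

In this toy case the restricted pass makes 15 distance computations where a conventional pass over all 16 centroids would make 32. The first sample sits on the grid edge, so it has only six candidate cells. The saving grows with the number of partitions, since the per-sample cost is bounded by nine regardless of the cluster count.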