IEEE Access (Jan 2023)
Genetically Optimized UFLANN for Uncovering Clusters
Abstract
In this work, we present a novel clustering approach that inherits the best characteristics of the Unsupervised Functional Link Artificial Neural Network (UFLANN) and Genetic Algorithms (GAs) for uncovering clusters embedded in a dataset represented as $(X)_{N \times d}$, where $X$ consists of $N$ data points in $d$ dimensions. With the aim of realizing natural clusters in a linear space, UFLANN maps the input vectors from a lower-dimensional space to a higher-dimensional one, in the hope of achieving linearity there. As a result, UFLANN introduces the curse of dimensionality into the given datasets, and the sparse data and distance concentration associated with it make clustering a very complex problem again. Hence, to address some of the issues of the curse of dimensionality, we use GAs to select an optimal number of features in the higher-dimensional space, from which UFLANN discovers the clusters embedded in the dataset. The proposed approach, hereinafter named GAUFLANN, has been experimentally evaluated using (i) the Davies-Bouldin (DB) index, (ii) the Silhouette score, and (iii) the Completeness score on different synthetic and real datasets. Our experimental study confirms that GAUFLANN scores better on the DB index, Silhouette score, and Completeness score than clustering methods such as K-means, Hierarchical-Agglomerative (Average Linkage), and UFLANN across datasets such as Circles, Moons, Iris, and CORD-19.
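The two steps described above, expanding each input vector into a higher-dimensional space and then keeping only a subset of the expanded features, can be illustrated with a minimal sketch. The abstract does not state which expansion GAUFLANN uses or how a GA individual is encoded, so the trigonometric basis, the binary-mask encoding, and the names `trig_expand` and `apply_mask` are assumptions made here for illustration only.

```python
import math
from typing import List, Sequence


def trig_expand(x: Sequence[float], order: int = 2) -> List[float]:
    """Expand a d-dimensional point into d * (2 * order + 1) features.

    Assumption: a trigonometric functional-link basis
    (v, sin(k*pi*v), cos(k*pi*v) for k = 1..order); the abstract
    does not specify the basis actually used.
    """
    expanded: List[float] = []
    for v in x:
        expanded.append(v)
        for k in range(1, order + 1):
            expanded.append(math.sin(k * math.pi * v))
            expanded.append(math.cos(k * math.pi * v))
    return expanded


def apply_mask(features: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Keep only the features whose mask bit is 1.

    Assumption: a GA individual is a binary mask over the expanded
    features (hypothetical encoding, not taken from the paper).
    """
    if len(features) != len(mask):
        raise ValueError("mask length must match the number of features")
    return [f for f, keep in zip(features, mask) if keep]


point = [0.5, -1.0]                      # d = 2
expanded = trig_expand(point, order=2)   # 2 * (2 * 2 + 1) = 10 features
mask = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]    # one hypothetical GA individual
selected = apply_mask(expanded, mask)    # 5 features survive the mask
```

In a full pipeline, a GA would evolve many such masks and score each one by the quality of the clustering obtained on the selected features; that search loop is omitted here.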
Keywords