Mathematics (Jul 2024)
Clustering Validation Inference
Abstract
Clustering validation is used to evaluate the quality of classifications and is a crucial step in unsupervised machine learning. A plethora of methods exist for this purpose; however, a common drawback is that they do not allow statistical inference. In this study, we construct a density function for the number of clusters using smoothing techniques. We then apply non-negative matrix factorization based on the Kullback–Leibler divergence. Employing a hypothesis of unique, linearly independent, and uncorrelated observational variables, we construct a sequence by varying the dimension of the span space of the factorization using only analytical techniques. The expectation of the limit of this sequence follows a gamma probability density function. By identifying the dimension of the span space of the factorization with the number of clusters, we transform the estimation of a suitable factorization dimension into a probabilistic estimate of the number of clusters. The approach is an internal validation method that is suitable for numerical and categorical multivariate data and independent of the clustering technique. Our main achievement is a predictive clustering validation model with graphical capabilities. It provides results in terms of credibility, making it possible to compare them with expert judgment on a quantitative basis.
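The abstract's exact construction of the density is developed later in the paper; purely as orientation, the following minimal Python sketch illustrates the ingredients it names: Kullback–Leibler non-negative matrix factorization computed over a range of factorization dimensions, followed by a gamma fit whose density is read as a credibility distribution over the number of clusters. The data matrix, the rank range, and the way the rank-wise KL errors are turned into weights are all illustrative assumptions, not the authors' procedure.

    # Illustrative sketch only (assumed pipeline, not the paper's construction):
    # KL-divergence NMF across candidate ranks, then a gamma density over ranks.
    import numpy as np
    from sklearn.decomposition import NMF
    from scipy.stats import gamma

    rng = np.random.default_rng(0)
    X = np.abs(rng.normal(size=(200, 12)))        # hypothetical non-negative data matrix

    ranks = list(range(1, 9))                      # assumed candidate cluster numbers
    kl_errors = []
    for k in ranks:
        model = NMF(n_components=k, beta_loss="kullback-leibler",
                    solver="mu", init="random", max_iter=500, random_state=0)
        model.fit(X)
        kl_errors.append(model.reconstruction_err_)   # KL reconstruction error at rank k

    # Assumed heuristic: convert the decrease in KL error into non-negative gains,
    # use them as weights over ranks, and fit a gamma density to weighted rank samples.
    gains = np.maximum(-np.diff([kl_errors[0]] + kl_errors), 1e-12)
    weights = gains / gains.sum()
    samples = rng.choice(ranks, size=5000, p=weights)
    a, loc, scale = gamma.fit(samples, floc=0)

    k_grid = np.arange(1, 9)
    credibility = gamma.pdf(k_grid, a, loc=loc, scale=scale)
    print("most credible number of clusters:", k_grid[np.argmax(credibility)])

The fitted gamma density plays the role of the probabilistic estimate described above: instead of a single "best" cluster number, each candidate receives a credibility value that can be plotted and compared against expert judgment.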
Keywords