Entropy (Jun 2023)

Combined Gaussian Mixture Model and Pathfinder Algorithm for Data Clustering

  • Huajuan Huang,
  • Zepeng Liao,
  • Xiuxi Wei,
  • Yongquan Zhou

DOI
https://doi.org/10.3390/e25060946
Journal volume & issue
Vol. 25, no. 6
p. 946

Abstract

Data clustering is one of the most influential branches of machine learning and data analysis, and Gaussian Mixture Models (GMMs) are frequently adopted for it because they are easy to implement. The approach has limitations, however: the number of clusters must be set manually, and a poor initialization may fail to extract the information within the dataset. To address these issues, we propose a new clustering algorithm called PFA-GMM, which combines GMMs with the Pathfinder algorithm (PFA). PFA-GMM first determines the optimal number of clusters automatically from the dataset. It then treats the clustering problem as a global optimization problem in order to avoid getting trapped in local convergence during initialization. Finally, we compare the proposed algorithm with other well-known clustering algorithms on both synthetic and real-world datasets. The experimental results indicate that PFA-GMM outperformed the competing approaches.
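
The abstract does not say how PFA-GMM chooses the number of clusters, so the following is only an illustrative sketch of the general idea of selecting the component count from the data. It swaps in a common stand-in, the Bayesian information criterion (BIC) evaluated over a range of candidate counts, with a plain EM fit of a one-dimensional mixture; it is not the Pathfinder-based procedure of the paper. All function names, the quantile-based initialization, the variance floor, and the toy data are assumptions made for this example.

```python
import math
from statistics import NormalDist


def log_likelihood(xs, weights, means, variances):
    """Total log-likelihood of the 1-D points xs under a Gaussian mixture."""
    total = 0.0
    for x in xs:
        density = sum(
            w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total


def fit_gmm_1d(xs, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    n = len(xs)
    ordered = sorted(xs)
    # Assumed deterministic start: means on evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    variances = [grand_var / k ** 2] * k
    weights = [1.0 / k] * k
    var_floor = 1e-3 * grand_var  # keeps a component from collapsing onto one point
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            norm = max(sum(dens), 1e-300)
            resp.append([d / norm for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    return weights, means, variances


def bic(xs, k):
    """BIC of a fitted k-component model; lower is better."""
    weights, means, variances = fit_gmm_1d(xs, k)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return n_params * math.log(len(xs)) - 2.0 * log_likelihood(xs, weights, means, variances)


def select_num_components(xs, max_k):
    """Return the candidate count with the lowest BIC, plus all scores."""
    scores = {k: bic(xs, k) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores


# Toy data (an assumption): three well-separated groups built from normal quantiles.
quantiles = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
data = [center + q for center in (-10.0, 0.0, 10.0) for q in quantiles]
best_k, bic_scores = select_num_components(data, max_k=6)
```

Because the toy data are built from three well-separated groups, a three-component fit is the expected winner here. EM started from a single fixed initialization, as in this sketch, can settle in a local optimum, which is the weakness the abstract says PFA-GMM addresses by treating clustering as a global optimization problem.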

Keywords