Scientific Reports (Dec 2024)

Uncovering the impact of outliers on clusters’ evolution in temporal data-sets: an empirical analysis

  • Muhammad Atif,
  • Muhammad Farooq,
  • Muhammad Shafiq,
  • Tmader Alballa,
  • Somayah Abdualziz Alhabeeb,
  • Hamide Abd El-Wahed Khalifa

DOI
https://doi.org/10.1038/s41598-024-75928-7
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 12

Abstract

This study investigates the impact of outliers on the evolution of clusters in temporal data sets. Cluster evolution refers to the changes in clustering results over time caused by the arrival of new data points. Monitoring and tracing cluster transitions lets us observe how clusters change over time, and tracking the movement of data points between clusters gives insight into the underlying patterns, trends, and dynamics of the data. This understanding is essential for making informed decisions and drawing meaningful conclusions from the clustering results. Changes in cluster solutions are classified as external or internal transitions. The study uses the survival ratio and the history cost function to examine how outliers affect the changes that clusters undergo at successive time points. The results show that outliers have a significant impact on cluster evolution and that appropriate outlier-handling techniques are necessary to obtain reliable clustering results. These findings offer useful insights for practitioners and researchers in stream clustering and can help guide the development of more robust and accurate stream clustering algorithms.

Keywords