IEEE Access (Jan 2023)
Dynamic and Static Enhanced BIRCH for Functional Data Clustering
Abstract
Accurate and efficient clustering of large-scale functional data is of utmost importance in the era of big data. However, current research does not fully exploit the differentiability inherent in functional data. To address this gap, we propose a novel method, Dynamic and Static Enhanced-BIRCH (DSE-BIRCH), which incorporates both constant and derivative features to simultaneously measure the static and dynamic distances between functional samples. To this end, a novel matrix factorization-based approach is introduced to transform the constant features, extracted through principal component analysis, into derivative features. These two sets of features are then fused to form global clustering features, with a separate weighting coefficient assigned to each set to reflect its relative importance. Finally, an enhanced BIRCH algorithm is employed to handle both static and dynamic constraints, enabling hierarchical clustering from a more comprehensive perspective. A rigorous mathematical definition of the algorithm is provided. The superior empirical performance of our method on publicly available and simulated datasets demonstrates that it effectively captures dynamic information and achieves accurate clustering on real-world data. Further experiments involving noise and complexity attest to the algorithm’s robustness and efficiency, highlighting its broad potential for applications in complex scenarios involving large-scale functional data.
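To make the idea of a combined static and dynamic distance concrete, the following minimal sketch compares two curves sampled on a shared uniform grid. It is an illustration under stated assumptions, not the DSE-BIRCH procedure itself: derivatives are approximated by forward differences rather than by the matrix factorization of principal component features described above, the weighted-sum form of the combination is assumed, and all function and parameter names (`finite_difference`, `l2_distance`, `static_dynamic_distance`, `weight`) are hypothetical.

```python
import math


def finite_difference(values, step):
    """Approximate the derivative of a uniformly sampled curve by forward differences."""
    return [(b - a) / step for a, b in zip(values, values[1:])]


def l2_distance(u, v, step):
    """Discrete L2 distance between two curves on the same grid (Riemann-sum approximation)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) * step)


def static_dynamic_distance(x, y, step, weight=0.5):
    """Weighted sum of a static (curve) distance and a dynamic (derivative) distance.

    `weight` is an assumed coefficient in [0, 1]: 1.0 uses only the static part,
    0.0 uses only the dynamic part.
    """
    static = l2_distance(x, y, step)
    dynamic = l2_distance(finite_difference(x, step), finite_difference(y, step), step)
    return weight * static + (1.0 - weight) * dynamic


# Two curves that differ only by a vertical offset: far apart statically,
# but with essentially identical derivatives.
step = 0.01
grid = [i * step for i in range(101)]
x = [math.sin(2 * math.pi * t) for t in grid]
y = [value + 1.0 for value in x]

print(static_dynamic_distance(x, y, step, weight=1.0))  # static part only
print(static_dynamic_distance(x, y, step, weight=0.0))  # dynamic part only
```

In this example the static term separates the two curves while the dynamic term treats them as nearly identical, which is why the choice of weighting coefficient determines which kind of difference drives the clustering.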
Keywords