Machine Learning and Knowledge Extraction (Aug 2023)
Identifying the Regions of a Space with the Self-Parameterized Recursively Assessed Decomposition Algorithm (SPRADA)
Abstract
This paper introduces a non-parametric methodology based on classical unsupervised clustering techniques to automatically identify the main regions of a space, without requiring the objective number of clusters, so as to identify the major regular states of unknown industrial systems. Indeed, useful knowledge on real industrial processes entails the identification of their regular states, and their historically encountered anomalies. Since both should form compact and salient groups of data, unsupervised clustering generally performs this task fairly accurately; however, this often requires the number of clusters upstream, knowledge which is rarely available. As such, the proposed algorithm operates a first partitioning of the space, then it estimates the integrity of the clusters, and splits them again and again until every cluster obtains an acceptable integrity; finally, a step of merging based on the clusters’ empirical distributions is performed to refine the partitioning. Applied to real industrial data obtained in the scope of a European project, this methodology proved able to automatically identify the main regular states of the system. Results show the robustness of the proposed approach in the fully-automatic and non-parametric identification of the main regions of a space, knowledge which is useful to industrial anomaly detection and behavioral modeling.
Keywords