IEEE Access (Jan 2019)
Optimization of Basic Clustering for Ensemble Clustering: An Information-Theoretic Perspective
Abstract
Current research on ensemble clustering focuses mainly on integration strategies, while the measurement and optimization of basic clusterings receive less attention. Based on information entropy theory, this paper proposes a quality metric for basic clusterings, and the clusterings are then selected using two-branch decisions and three-way decisions, respectively. Governed by preset threshold(s), two mechanisms are developed: two-branch based basic clustering filtering (BCF2BD) and three-way based basic clustering filtering (BCF3WD). Concretely, in BCF2BD a basic clustering is deleted if its quality metric is less than the preset threshold ξ, and a new clustering member is added to keep the size of the basic clustering set constant. In BCF3WD, a basic clustering is deleted if its quality metric is less than the preset threshold β, retained if its quality metric is greater than the preset threshold α, and recalculated if its quality metric lies between β and α. Both mechanisms are executed repeatedly until either no further basic clusterings are removed or the maximum iteration count is reached. Comparative experiments show that both filtering algorithms effectively improve the performance of ensemble clustering, and that the three-way decision filtering algorithm consumes less time.
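The threshold rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the entropy-based quality metric, so `quality_fn` is a hypothetical caller-supplied function, `generate_fn` is a hypothetical generator of replacement clusterings, and the handling of values exactly equal to a threshold is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

Clustering = Sequence[int]  # one cluster label per sample


def two_branch_decision(quality: float, xi: float) -> str:
    """Delete below xi, otherwise retain (equality handling is an assumption)."""
    return "delete" if quality < xi else "retain"


def three_way_decision(quality: float, alpha: float, beta: float) -> str:
    """Delete below beta, retain above alpha, recalculate in between."""
    if quality < beta:
        return "delete"
    if quality > alpha:
        return "retain"
    # Boundary region; treating equality as boundary is an assumption.
    return "recalculate"


def filter_two_branch(
    clusterings: List[Clustering],
    quality_fn: Callable[[Clustering], float],  # hypothetical quality metric
    generate_fn: Callable[[], Clustering],      # hypothetical member generator
    xi: float,
    max_iter: int = 10,
) -> Tuple[List[Clustering], int]:
    """Repeat delete-and-refill until nothing is deleted or max_iter is hit."""
    members = list(clusterings)
    target_size = len(members)
    iterations = 0
    for _ in range(max_iter):
        iterations += 1
        kept = [
            c for c in members
            if two_branch_decision(quality_fn(c), xi) == "retain"
        ]
        if len(kept) == len(members):
            break  # no member was deleted in this pass
        # Refill with new members so the set size stays constant.
        members = kept + [generate_fn() for _ in range(target_size - len(kept))]
    return members, iterations
```

A three-way loop would follow the same pattern, with an extra branch for members in the boundary region; what "recalculated" involves is not specified in the abstract, so it is left out of the sketch.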
Keywords