A Generalized Clustering Method Based on Validity Indices and Membership Functions

Edwin Aldana-Bobadilla; Ivan Lopez-Arevalo; Hiram Galeana-Zapien; Melesio Crespo-Sanchez

doi:10.1109/ACCESS.2018.2882408

IEEE Access (Jan 2018)

Affiliations

Edwin Aldana-Bobadilla: Conacyt-Centro de Investigación y de Estudios Avanzados del I.P.N. (Cinvestav), Unidad Tamaulipas, Ciudad Victoria, Mexico
Ivan Lopez-Arevalo: Centro de Investigación y de Estudios Avanzados del I.P.N. (Cinvestav), Unidad Tamaulipas, Ciudad Victoria, Mexico
Hiram Galeana-Zapien: Centro de Investigación y de Estudios Avanzados del I.P.N. (Cinvestav), Unidad Tamaulipas, Ciudad Victoria, Mexico
Melesio Crespo-Sanchez: Centro de Investigación y de Estudios Avanzados del I.P.N. (Cinvestav), Unidad Tamaulipas, Ciudad Victoria, Mexico

DOI: https://doi.org/10.1109/ACCESS.2018.2882408
Journal volume & issue: Vol. 6, pp. 75912–75923

Abstract

Clustering is an important task in data analysis that seeks a partition of an unlabeled dataset based on similarity relationships among its elements. Typically, such similarity is determined by a proximity measure or distance. The optimal partition is then the one that minimizes the distance among elements belonging to the same subset and maximizes the distance among elements from different subsets. The way in which the optimal partition is found is called the clustering method. The adequacy of the partition found is commonly assessed with a validity index. In this paper, we propose a clustering method referred to as quality-driven search for optimal partition (QDSOC), in which the search for the optimal partition is driven directly by a validity index instead of a proximity measure. Our approach explores a large solution space efficiently by means of a variant of the genetic algorithm, the so-called eclectic genetic algorithm. Unlike existing clustering methods, the proposed QDSOC not only finds the optimal partition but also provides a mathematical model of that partition in terms of a representation based on membership functions. This model describes the points that belong to the subsets in the partition found. Thus, by using this model, we can predict the membership of new objects without performing the search process again. As part of the experimental evaluation, our proposed QDSOC method is compared with k-means and self-organizing maps (SOMs), two well-known clustering approaches. The three methods were applied to a wide sample of clustering problems using three different validity indices. From the obtained results, we demonstrate that QDSOC statistically outperforms k-means and SOMs. We also point out that our approach does not incur excessive computational overhead with respect to such traditional clustering methods.
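
The idea of letting a validity index, rather than a proximity measure, drive the search can be illustrated with a minimal sketch. It is not the QDSOC method: it assumes the Dunn index as the validity index (the abstract does not name the three indices used), and it replaces the eclectic genetic algorithm with a simple single-mutation hill climb. The names `dunn_index` and `search_partition` are hypothetical and chosen only for this illustration.

```python
import math
import random


def dunn_index(points, labels):
    """Dunn index: smallest between-cluster distance divided by the
    largest within-cluster distance. Higher values mean a better partition."""
    min_between = math.inf
    max_within = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    # Degenerate partitions (one cluster only, or no within-cluster spread)
    # are scored 0.0 so the search treats them as uninformative.
    if min_between == math.inf or max_within == 0.0:
        return 0.0
    return min_between / max_within


def search_partition(points, k, steps=2000, seed=0):
    """Assumed stand-in for the genetic search: mutate one label at a time
    and keep the candidate whenever the validity index does not get worse."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]
    best = dunn_index(points, labels)
    for _ in range(steps):
        candidate = labels[:]
        candidate[rng.randrange(len(points))] = rng.randrange(k)
        score = dunn_index(points, candidate)
        if score >= best:
            labels, best = candidate, score
    return labels, best


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points (made-up data).
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    found_labels, found_score = search_partition(data, k=2)
    print(found_labels, found_score)
```

The objective function here is the validity index itself, so swapping in a different index only requires replacing `dunn_index`. A hill climb of this kind can stall in local optima, which is one reason a population-based search such as a genetic algorithm may be preferred for large solution spaces.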

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
