Teknika (Jun 2024)
Hierarchical clustering algorithm-dendogram using Euclidean and Manhattan distance
Abstract
This paper presents the outcomes of a research experiment on the drying process of seaweed. Data can be clustered in many ways, including partitioning methods and the Hierarchical Clustering Algorithm (HCA). HCA represents the clustering as a binary tree (a dendrogram), which makes the cluster structure easy to visualize. We compare the four primary linkage methods used in HCA: 1) single linkage, 2) complete linkage, 3) average linkage, and 4) Ward's linkage. Clustering validation is widely recognized as a crucial issue that strongly affects the effectiveness of clustering algorithms. It falls into two categories: internal and external validation. Internal clustering validation is particularly important in data science. The main goal of this article is to empirically evaluate the behavior of a representative set of internal clustering validation indices: Connectivity, Dunn, and Silhouette. The HCA is applied with two distance functions, Euclidean and Manhattan, to analyze the entanglement function and internal validity.
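As an illustration of how the choice of distance function and linkage method enters agglomerative clustering, the following is a minimal sketch in plain Python. It is not the implementation used in the study. The function names (`euclidean`, `manhattan`, `agglomerate`) and the sample points are hypothetical, and Ward's linkage is left out to keep the sketch short.

```python
from itertools import combinations

def euclidean(a, b):
    # Straight-line distance: square root of the summed squared differences.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    # City-block distance: sum of the absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

# Each linkage reduces all pairwise distances between two clusters to one number.
# Ward's linkage is intentionally omitted from this sketch.
LINKAGES = {
    "single": min,                            # closest pair of members
    "complete": max,                          # farthest pair of members
    "average": lambda ds: sum(ds) / len(ds),  # mean over all pairs
}

def agglomerate(points, k, dist=euclidean, linkage="single"):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Returns the clusters as sorted lists of point indices.
    """
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]  # start with singletons

    def gap(pair):
        a, b = pair
        return combine([dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b]])

    while len(clusters) > k:
        # Pick the pair of clusters with the smallest linkage distance.
        a, b = min(combinations(range(len(clusters)), 2), key=gap)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return sorted(sorted(c) for c in clusters)

# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
for dist in (euclidean, manhattan):
    for linkage in LINKAGES:
        print(dist.__name__, linkage, agglomerate(points, 3, dist, linkage))
```

The sketch recomputes every pairwise distance at each merge, which is adequate for a handful of points but far slower than a library implementation. Recording the order and height of the merges, rather than stopping at `k` clusters, would give the information a dendrogram displays.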
Keywords