Romanian Journal of Oral Rehabilitation (Jun 2024)
THE EFFICIENCY OF QUALITATIVE DATA CLUSTERING IN MEDICAL AND SOCIO-ECONOMIC SURVEYS – COMPARATIVE STUDY
Abstract
Clustering is a complex data mining tool used to identify similarities in large amounts of data, and medical databases are highly suitable in this regard. Our paper aims to compare the efficacy of two well-known clustering methods, the k-means algorithm and the classical hierarchical algorithm, and to apply them to a medical-economic database on dietary habits, socio-economic status and oral health in a sample of 326 men, aged between 25 and 30, living in an urban area, in order to identify possible associations between dietary habits and income levels. We identified 4 clusters, which correspond partially to the 4 income levels recorded in the investigated sample and reveal the associated dietary habits. The k-means clustering performed better than the Single Linkage hierarchical classification and is therefore highly suitable for the analysis of socio-economic and general health data.
Keywords