Engineering Proceedings (Jan 2024)

Quantitative Comparison of Machine Learning Clustering Methods for Tuberculosis Data Analysis

  • Marlen Kossakov,
  • Assel Mukasheva,
  • Gani Balbayev,
  • Syrym Seidazimov,
  • Dinargul Mukammejanova,
  • Madina Sydybayeva

DOI
https://doi.org/10.3390/engproc2024060020
Journal volume & issue
Vol. 60, no. 1
p. 20

Abstract

Machine learning (ML) has made data-driven decision making essential in many fields, providing insights that improve productivity and quality of life. Clustering, a fundamental ML technique, groups similar data points. In tuberculosis (TB) research, it plays a critical role in identifying patient subgroups and customising treatment. Although prior studies have recognised its utility, a comprehensive comparative analysis of multiple clustering methods applied to TB data is lacking. This study assesses and compares four well-known clustering algorithms on TB data: k-means, hierarchical clustering, DBSCAN, and spectral clustering. Cluster quality is evaluated with three quantitative measures: the silhouette score, the Davies–Bouldin index, and the Calinski–Harabasz index. The results provide quantitative insights that improve understanding of clustering and guide future research.
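
The kind of comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_blobs` in place of TB data, and every hyperparameter (`n_clusters=3`, `eps=0.5`, `min_samples=5`) as well as the helper name `evaluate_clusterings` is a placeholder chosen for the example. A higher silhouette score and Calinski–Harabasz index indicate better-separated clusters, while a lower Davies–Bouldin index is better.

```python
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    silhouette_score,
    davies_bouldin_score,
    calinski_harabasz_score,
)
from sklearn.preprocessing import StandardScaler


def evaluate_clusterings(X, n_clusters=3, random_state=0):
    """Fit four clustering models on X and score each labelling.

    Hypothetical helper for illustration; all hyperparameters are assumptions.
    """
    models = {
        "k-means": KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state),
        "hierarchical": AgglomerativeClustering(n_clusters=n_clusters),
        "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
        "spectral": SpectralClustering(n_clusters=n_clusters, random_state=random_state),
    }
    results = {}
    for name, model in models.items():
        labels = model.fit_predict(X)
        # DBSCAN marks noise points with the label -1; exclude them from scoring.
        mask = labels != -1
        n_found = len(set(labels[mask]))
        if n_found < 2:
            # The three metrics are undefined for fewer than two clusters.
            results[name] = None
            continue
        results[name] = {
            "n_clusters": n_found,
            "silhouette": silhouette_score(X[mask], labels[mask]),
            "davies_bouldin": davies_bouldin_score(X[mask], labels[mask]),
            "calinski_harabasz": calinski_harabasz_score(X[mask], labels[mask]),
        }
    return results


# Synthetic stand-in data (NOT tuberculosis data), standardised before clustering.
X_raw, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X_raw)

results = evaluate_clusterings(X)
for name, scores in results.items():
    print(name, scores)
```

Because DBSCAN chooses the number of clusters itself and may label points as noise, its scores are computed on a different subset of points than the other three methods, which should be kept in mind when comparing the numbers.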

Keywords