IEEE Access (Jan 2020)

K-Means and K-Medoids: Cluster Analysis on Birth Data Collected in City Muzaffarabad, Kashmir

  • Syed Ali Abbas,
  • Adil Aslam,
  • Aqeel Ur Rehman,
  • Wajid Arshad Abbasi,
  • Saeed Arif,
  • Syed Zaki Hassan Kazmi

DOI
https://doi.org/10.1109/ACCESS.2020.3014021
Journal volume & issue
Vol. 8
pp. 151847 – 151855

Abstract

Read online

In the field of medical, each and every analysis is decisive as the study links to life of the subject under observation. One of the most vital area in the field of medical is the healthcare of expecting women in low income countries. High mortality rate due to increased number of caesarean section is evident because of poor medical infrastructure in the region, misunderstood religious teachings, low education and lack of proper decision making at the right time. The root cause analysis of situations demanding caesarean section is a tough job, however in the presence of historical data, one may extract useful information that will help supporting a medical decision by predicting the outcome. It is obvious that regional disparities have a huge impact on the residents of that region. A study performed on any region cannot be all applicable to the residents of some other distant region. This motive has established grounds to conduct a local study upon the data collected from expecting women in city Muzaffarabad, Kashmir. It is believed that the findings of this study will be significant for women that share more or less similar physical, social and maternal traits. Keeping this in mind, study presents an analysis of two clustering techniques for the investigation of appropriate algorithm that groups data into relevant clusters robustly. Firstly, we analyzed K-means and K-medoids algorithms' capability to cluster the data using different distance metrics. Secondly, data transformation techniques including scale, range and Yeo-Johnson are applied. Finally, transformed data are used in K-means and K-medoids algorithms' to generate cluster accuracy. It is observed that the results produced from transformed data are better than using raw data. Yeo-Johnson transformation method is found best for k-means (Hartigan & Wang), K-medoids (SEV distance function) and Rank k-medoids (SEV distance function) with mean accuracy 67.58%, 69.58% and 72.64% respectively.

Keywords