K-Means and K-Medoids: Cluster Analysis on Birth Data Collected in City Muzaffarabad, Kashmir

Syed Ali Abbas; Adil Aslam; Aqeel Ur Rehman; Wajid Arshad Abbasi; Saeed Arif; Syed Zaki Hassan Kazmi

doi:10.1109/ACCESS.2020.3014021

IEEE Access (Jan 2020)

Affiliations

Syed Ali Abbas: Department of Computer Science and Information Technology, University of Azad Jammu and Kashmir, Muzaffarabad, Pakistan
Adil Aslam: Department of Computer Science and Information Technology, University of Azad Jammu and Kashmir, Muzaffarabad, Pakistan
Aqeel Ur Rehman: Department of Electronics and Information Engineering, Southwest University, Chongqing, China
Wajid Arshad Abbasi: Department of Computer Science and Information Technology, University of Azad Jammu and Kashmir, Muzaffarabad, Pakistan
Saeed Arif: Department of Computer Science, Saudi Electronic University, Riyadh, Saudi Arabia
Syed Zaki Hassan Kazmi: Department of Computer Science and Information Technology, University of Azad Jammu and Kashmir, Muzaffarabad, Pakistan

DOI: https://doi.org/10.1109/ACCESS.2020.3014021
Journal volume & issue: Vol. 8
pp. 151847 – 151855

Abstract

In medicine, every analysis matters because the study bears directly on the life of the subject under observation. One of the most vital areas in the field is the healthcare of expectant women in low-income countries. A high mortality rate associated with the increased number of caesarean sections is evident in the region, owing to poor medical infrastructure, misunderstood religious teachings, low education and a lack of timely decision making. Root cause analysis of the situations that demand a caesarean section is difficult; however, where historical data are available, one may extract useful information that supports a medical decision by predicting the outcome. Regional disparities have a large impact on the residents of a region, so a study performed on one region is not fully applicable to the residents of a distant one. This motivated a local study of data collected from expectant women in the city of Muzaffarabad, Kashmir. The findings are expected to be relevant for women who share broadly similar physical, social and maternal traits. With this in mind, the study analyzes two clustering techniques to identify an algorithm that groups the data into relevant clusters robustly. First, we analyze the capability of the K-means and K-medoids algorithms to cluster the data using different distance metrics. Second, data transformation techniques, namely scale, range and Yeo-Johnson, are applied. Finally, the transformed data are used in the K-means and K-medoids algorithms to measure cluster accuracy. The results produced from transformed data are better than those from raw data. The Yeo-Johnson transformation performs best for K-means (Hartigan & Wong), K-medoids (SEV distance function) and Rank K-medoids (SEV distance function), with mean accuracies of 67.58%, 69.58% and 72.64%, respectively.
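
The three transformations and the accuracy measure named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things are assumptions made for illustration: "scale" is taken to mean z-score standardization, "range" to mean min-max rescaling to [0, 1], the Yeo-Johnson parameter `lam` is passed in by hand (in practice it is usually estimated from the data), and "cluster accuracy" is computed by assigning each cluster its majority class label. All function names are hypothetical.

```python
import math
from collections import Counter


def standardize(values):
    """Assumed meaning of "scale": zero mean, unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def range_transform(values):
    """Assumed meaning of "range": min-max rescaling to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def yeo_johnson(values, lam):
    """Yeo-Johnson power transform for a fixed, user-supplied lambda."""
    out = []
    for v in values:
        if v >= 0:
            if abs(lam) < 1e-12:
                out.append(math.log1p(v))
            else:
                out.append(((v + 1) ** lam - 1) / lam)
        else:
            if abs(lam - 2) < 1e-12:
                out.append(-math.log1p(-v))
            else:
                out.append(-(((1 - v) ** (2 - lam)) - 1) / (2 - lam))
    return out


def cluster_accuracy(cluster_ids, true_labels):
    """Share of points whose true label equals their cluster's majority label."""
    by_cluster = {}
    for c, y in zip(cluster_ids, true_labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(true_labels)
```

In a full pipeline, each feature column would be transformed first, the clustering algorithm would be run on the transformed data, and `cluster_accuracy` would then compare the resulting cluster assignments with the recorded delivery outcome.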

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
