K-means-SMOTE for handling class imbalance in the classification of diabetes with C4.5, SVM, and naive Bayes

Hairani Hairani; Khurniawan Eko Saputro; Sofiansyah Fadli

doi:10.14710/jtsiskom.8.2.2020.89-93

Jurnal Teknologi dan Sistem Komputer (Apr 2020)

Affiliations

Hairani Hairani: Program Studi Ilmu Komputer, Fakultas Teknik dan Kesehatan, Universitas Bumigora, Indonesia
Khurniawan Eko Saputro: Program Studi Teknologi Informasi, Fakultas Teknik dan Kesehatan, Universitas Bumigora, Indonesia
Sofiansyah Fadli: Program Studi Teknik Informatika, Sekolah Tinggi Manajemen Informatika dan Komputer Lombok, Indonesia

DOI: https://doi.org/10.14710/jtsiskom.8.2.2020.89-93
Journal volume & issue: Vol. 8, no. 2
pp. 89 – 93

Abstract

Class imbalance in a dataset biases classification results toward the class with the most instances (the majority class). A sampling method is needed to augment the minority class (the positive class) so that the class distribution becomes balanced, which leads to better classification results. This study addresses the class imbalance problem in the Pima Indians diabetes dataset using k-means-SMOTE. The dataset has 268 instances of the positive class (minority class) and 500 instances of the negative class (majority class). Classification was performed with C4.5, SVM, and naive Bayes, with k-means-SMOTE applied as the data sampling step, and the three methods were compared. With k-means-SMOTE, SVM achieved the highest accuracy and sensitivity, at 82% and 77% respectively, while naive Bayes produced the highest specificity, at 89%.
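
As a rough illustration of the quantities named in the abstract, the sketch below computes how many synthetic minority instances an oversampler would have to generate for the stated class counts, and defines accuracy, sensitivity, and specificity from a confusion matrix. The 1:1 target ratio, the names `synthetic_needed` and `ConfusionMatrix`, and the example counts are assumptions made here for illustration; they are not taken from the paper, and the counts are not the paper's results.

```python
from dataclasses import dataclass

# Class counts stated in the abstract.
N_POSITIVE = 268  # minority (positive) class
N_NEGATIVE = 500  # majority (negative) class


def synthetic_needed(n_minority: int, n_majority: int) -> int:
    """Synthetic minority samples needed to reach a 1:1 ratio (assumed target)."""
    return max(n_majority - n_minority, 0)


@dataclass
class ConfusionMatrix:
    tp: int  # positives predicted positive
    fn: int  # positives predicted negative
    tn: int  # negatives predicted negative
    fp: int  # negatives predicted positive

    def accuracy(self) -> float:
        # Share of all instances classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    def sensitivity(self) -> float:
        # Share of positive instances that were detected.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of negative instances that were correctly rejected.
        return self.tn / (self.tn + self.fp)


# Hypothetical counts for illustration only; NOT results from the paper.
example = ConfusionMatrix(tp=40, fn=10, tn=45, fp=5)

if __name__ == "__main__":
    print("synthetic samples needed:", synthetic_needed(N_POSITIVE, N_NEGATIVE))
    print("accuracy:", example.accuracy())
    print("sensitivity:", example.sensitivity())
    print("specificity:", example.specificity())
```

Sensitivity depends only on the positive class and specificity only on the negative class, which is why a classifier biased toward the majority class can show high specificity while its sensitivity stays low.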

Published in Jurnal Teknologi dan Sistem Komputer

ISSN: 2620-4002 (Print); 2338-0403 (Online)
Publisher: Diponegoro University
Country of publisher: Indonesia
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://jtsiskom.undip.ac.id/index.php/jtsiskom
