Jurnal Informatika (Nov 2024)
Implementation of PPCA Imputation, SMOTE-N Class Balancing in Hepatitis Classification Using Naïve Bayes
Abstract
Complete data are crucial in research, especially in its initial stages. The hepatitis dataset used in this study had missing values and class imbalance, which prevented it from being used optimally. Missing values were filled in using probabilistic principal component analysis (PPCA) imputation. The imputed data were then balanced using the SMOTE-N class balancing method and classified using Gaussian Naïve Bayes. The aim of this research was to compare the classification performance for hepatitis disease obtained with Naïve Bayes when PPCA imputation and SMOTE-N class balancing are applied. The best AUC values were 0.833 in the first scenario, with an 80:20 split between training and testing data, and 0.875 in the second scenario, with a 90:10 split. The highest AUC value was obtained by applying PPCA imputation with SMOTE-N class balancing before Naïve Bayes classification. This demonstrates that PPCA imputation combined with SMOTE-N class balancing improves the performance of Naïve Bayes classification.
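As an illustration of the classification and evaluation steps only, the following is a minimal sketch of Gaussian Naïve Bayes scoring followed by an AUC computation, written in plain Python. It is not the authors' implementation: the function names (`fit_gaussian_nb`, `positive_probability`, `roc_auc`), the variance floor, and the toy data are all assumptions made for this example, and the PPCA imputation and SMOTE-N balancing steps are omitted.

```python
import math
from itertools import product


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate class priors and per-feature means and variances."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # var_floor is an assumed small constant that avoids division by zero.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(y), means, variances)
    return model


def positive_probability(model, x, positive=1):
    """Posterior probability of the positive class for one sample."""
    log_joint = {}
    for label, (prior, means, variances) in model.items():
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_joint[label] = ll
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_joint.values())
    total = sum(math.exp(v - top) for v in log_joint.values())
    return math.exp(log_joint[positive] - top) / total


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Toy, hypothetical data (not taken from the hepatitis dataset).
X_train = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [3.0, 0.5], [3.2, 0.7], [2.8, 0.4]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[1.1, 2.1], [2.9, 0.6], [1.0, 1.9], [3.1, 0.5]]
y_test = [0, 1, 0, 1]

model = fit_gaussian_nb(X_train, y_train)
scores = [positive_probability(model, x) for x in X_test]
auc = roc_auc(y_test, scores)
print(f"AUC on the toy test set: {auc:.3f}")
```

In this sketch the AUC is computed from predicted probabilities rather than from hard class labels, so it measures how well the classifier ranks positive cases above negative ones regardless of any decision threshold.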
Keywords