Pilar Nusa Mandiri (Sep 2024)
CLASSIFICATION OF HEART DISEASE USING THE K-NEAREST NEIGHBOR ALGORITHM AND LOGISTIC REGRESSION
Abstract
Heart disease is a leading cause of death worldwide, including in Indonesia, with rising incidence and mortality rates that place a heavy burden on health and society. Lack of awareness of early signs contributes significantly to this challenge. This study aims to support the prevention of heart disease through early diagnosis using the K-Nearest Neighbor (K-NN) and Logistic Regression algorithms. The dataset, obtained from Kaggle.com, includes 15 clinical attributes for cardiac diagnosis. Testing shows that the K-NN method with k = 3 achieves the highest performance on the test data (30% of the dataset), with 90% accuracy, 93% precision, 87% recall, and a 90% F1-score. In comparison, Logistic Regression with a sigmoid function achieved 86% accuracy, 83% precision, 90% recall, and an 86% F1-score on the same test data. These results show that K-Nearest Neighbor outperforms Logistic Regression as a classification algorithm on this heart disease dataset, although Logistic Regression attains the higher recall. Applying these findings in a web-based Streamlit system is expected to improve the efficiency and timeliness of heart disease screening.
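The F1-scores quoted above can be checked against the quoted precision and recall, since F1 is the harmonic mean of the two. The following is a minimal sketch of that arithmetic only; it assumes the 93%/87% and 83%/90% pairs are the precision and recall of K-NN and Logistic Regression respectively, and the helper name `f1_score` is illustrative rather than taken from the study's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (assumed pairing).
knn_f1 = f1_score(0.93, 0.87)
logreg_f1 = f1_score(0.83, 0.90)

print(f"K-NN (k = 3): F1 = {knn_f1:.2f}")        # prints 0.90
print(f"Logistic Regression: F1 = {logreg_f1:.2f}")  # prints 0.86
```

Both values round to the reported 90% and 86%, so the stated F1-scores are consistent with the stated precision and recall.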
Keywords