IEEE Access (Jan 2024)
Chronic Diseases Prediction Using Machine Learning With Data Preprocessing Handling: A Critical Review
Abstract
According to the World Health Organization (WHO), early prevention is essential for chronic diseases such as diabetes mellitus, stroke, cancer, cardiovascular disease, kidney failure, and hypertension. One preventive measure is to predict chronic diseases using machine learning based on personal medical records or general checkup results. The common objective is to minimize prediction error. The factors that most influence chronic disease prediction are the quality of the data and the choice of predictor, that is, the machine learning method. The five main issues that lower data quality are outliers, missing values, feature selection, normalization, and imbalance. Once data quality is ensured, the next task is to choose the best machine learning method. The most important factor to consider when choosing a predictor is its performance evaluation (accuracy, recall, precision, and F1-score). Thus, research on chronic disease prediction aims both to improve predictive performance and to solve problems in medical data. This paper presents a Systematic Literature Review (SLR) that offers a comprehensive discussion of research on chronic disease prediction using machine learning and the associated data preprocessing. The machine learning methods covered include supervised learning, ensemble learning, deep learning, and reinforcement learning. The preprocessing steps discussed include the handling of missing values, outliers, feature selection, normalization, and imbalance. The paper concludes with open issues and potential future work on improving prediction performance for chronic diseases through data preprocessing and machine learning methods.
Keywords