Jurnal Riset Informatika (Mar 2022)
THE EFFECTIVENESS ANALYSIS OF RANDOM FOREST ALGORITHMS WITH SMOTE TECHNIQUE IN PREDICTING LUNG CANCER RISK
Abstract
Lung cancer accounts for a larger share of cancer deaths than any other type of cancer. Detecting the disease requires screening tests such as X-rays, CT scans, and MRI. Before ordering these tests, however, a doctor ordinarily reviews the patient's medical history and performs a physical examination to assess symptoms and possible risk factors for lung cancer. The lung cancer data set has a class imbalance that degrades the performance of the random forest algorithm in predicting lung cancer risk. This study applies the SMOTE technique together with the random forest algorithm to increase accuracy in predicting lung cancer risk. Data processing and analysis are carried out in the Python programming language. The test results show an accuracy of 88% with an AUC of 0.93. When the random forest method is used to predict lung cancer risk, the SMOTE technique is useful for dealing with class imbalance in the data set.
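The core idea of SMOTE, as used for the class imbalance described above, can be illustrated with a minimal from-scratch sketch. The abstract does not say which implementation the study used, so everything here is an assumption for illustration: the function names `smote` and `balance`, the neighbour count `k`, the two-class setup, and the toy data are hypothetical and are not the lung cancer data set. The sketch creates each synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples built from minority-class feature vectors."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes have equal size (binary case)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new = smote(minority, n_new, k=k, seed=seed)
    return X + new, y + [minority_label] * len(new)


# Hypothetical toy data: six majority samples (label 0), two minority samples (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.2, 0.4], [0.4, 0.2], [2.0, 2.0], [2.2, 2.1]]
y = [0] * 6 + [1] * 2

X_bal, y_bal = balance(X, y, minority_label=1)
print(y_bal.count(0), y_bal.count(1))
```

In a workflow like the one the abstract describes, the balanced samples would then be used to train the random forest. Oversampling is normally applied to the training split only, so that accuracy and AUC are measured on data that contains no synthetic samples.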
Keywords