Intelligent Medicine (Aug 2024)

A clinical decision support system using rough set theory and machine learning for disease prediction

  • Kamakhya Narain Singh,
  • Jibendu Kumar Mantri

Journal volume & issue
Vol. 4, no. 3
pp. 200 – 208

Abstract

Objective: Technological advances have drastically changed daily life, and healthcare in particular: traditional diagnostic methods are being replaced by technology-oriented models, and paper-based patient healthcare records by digital files. Using the latest technology and data mining techniques, we aimed to develop an automated clinical decision support system (CDSS) to improve patient prognoses and healthcare delivery. Our proposed approach placed a strong emphasis on improvements that meet patient, parent, and physician expectations. We developed a flexible framework to identify hepatitis, dermatological conditions, hepatic disease, and autism in adults and to provide the results to patients as recommendations. The novelty of this CDSS lies in its integration of rough set theory (RST) and machine learning (ML) techniques to improve the accuracy and effectiveness of clinical decision-making.
Methods: Data were collected from various web-based resources. Standard preprocessing techniques were applied to encode categorical features, perform min-max scaling, and remove null and duplicate entries. Missing categorical feature values were filled using the most prevalent feature in the class, and missing continuous feature values using the standard deviation. A rough set approach was applied for feature selection, to remove highly redundant and irrelevant features. Then, various ML techniques, including K-nearest neighbors (KNN), linear support vector machine (LSVM), radial basis function support vector machine (RBF SVM), decision tree (DT), random forest (RF), and naive Bayes (NB), were employed to analyze four publicly available benchmark medical datasets of different types from the UCI repository and Kaggle. The model was implemented in Python, and various validity metrics, including precision, recall, F1-score, and root mean square error (RMSE), were applied to measure its performance.
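
The rough set feature selection step can be illustrated with a minimal sketch. The abstract does not say which RST algorithm was used, so the greedy QuickReduct-style search below is an assumption, as are the function names and the toy data. It computes the rough set dependency degree (the fraction of records whose equivalence class under the chosen attributes maps to a single decision label) and adds attributes until the dependency of the full attribute set is reached. Classical rough sets operate on discrete values, so continuous features would need to be discretized first.

```python
from collections import defaultdict


def dependency(rows, labels, attrs):
    """Dependency degree: share of rows whose equivalence class under
    `attrs` contains only one decision label (the positive region)."""
    seen_labels = defaultdict(set)
    sizes = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(label)
        sizes[key] += 1
    positive = sum(sizes[k] for k, labs in seen_labels.items() if len(labs) == 1)
    return positive / len(rows)


def quick_reduct(rows, labels):
    """Greedily add the attribute that raises dependency the most until the
    dependency of the full attribute set is matched."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        candidates = [a for a in all_attrs if a not in reduct]
        best = max(candidates, key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    return reduct


# Hypothetical toy records: attribute 0 determines the label,
# attribute 1 duplicates attribute 0, attribute 2 is unrelated noise.
rows = [(0, 0, 1), (0, 0, 0), (1, 1, 1), (1, 1, 0), (2, 2, 1), (2, 2, 0)]
labels = ["neg", "neg", "pos", "pos", "pos", "pos"]

selected = quick_reduct(rows, labels)
print(selected)
```

On this toy data the search keeps a single attribute, discarding both the redundant copy and the noise column. A greedy search of this kind finds a reduct, not necessarily the smallest one.
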
Results: Features were selected using an RST approach and examined by RF analysis; the important features of hepatitis, dermatology conditions, hepatic disease, and autism determined by RST and by RF exhibited 92.85 %, 90.90 %, 100 %, and 80 % similarity, respectively. Selected features were stored as electronic health records, and various ML classifiers (KNN, LSVM, RBF SVM, DT, RF, and NB) were applied to classify patients with hepatitis, dermatology conditions, hepatic disease, and autism. In the last phase, the performance of the proposed classifiers was compared with that of existing state-of-the-art methods, using various validity measures. RF was found to be the best approach for adult screening of: hepatitis, with accuracy 88.66 %, precision 74.46 %, recall 75.17 %, F1-score 74.81 %, and RMSE value 0.244; dermatology conditions, with accuracy 97.29 %, precision 96.96 %, recall 96.96 %, F1-score 96.96 %, and RMSE value 0.173; hepatic disease, with accuracy 91.58 %, precision 81.76 %, recall 81.82 %, F1-score 81.79 %, and RMSE value 0.193; and autism, with accuracy 100 %, precision 100 %, recall 100 %, F1-score 100 %, and RMSE value 0.064.
Conclusion: The overall performance of our proposed framework suggests that it could assist medical experts in more accurately identifying and diagnosing patients with hepatitis, dermatology conditions, hepatic disease, and autism.
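
For readers unfamiliar with the validity measures reported above, the following sketch shows how accuracy, precision, recall, F1-score, and RMSE are commonly computed for a binary screening task. It is illustrative only: the abstract does not state the averaging scheme or the quantities over which RMSE was taken, so the single positive class, the use of hard 0/1 predictions for RMSE, the function names, and the toy labels are all assumptions.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def rmse(y_true, y_score):
    """Root mean square error between true values and predicted values."""
    squared = [(t - s) ** 2 for t, s in zip(y_true, y_score)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical labels: 1 = condition present, 0 = condition absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
error = rmse(y_true, y_pred)
print(metrics, error)
```

Accuracy can exceed precision and recall by a wide margin when the negative class dominates, which is why all of these measures are usually reported together.
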

Keywords