Network Biology (Sep 2021)

Diagnosis of diabetes: A machine learning paradigm using optimized features

  • Rafid Mostafiz,
  • Khandaker Mohammad Mohi Uddin,
  • Mohammad Shorif Uddin, et al.

Journal volume & issue
Vol. 11, no. 3
pp. 222 – 240

Abstract

Diabetes, a disease characterized by hyperglycemia, is currently considered incurable. Modern healthcare identifies factors such as an uncontrolled lifestyle, an unbalanced diet, genetic complexity, excessive mental fatigue, and obesity as responsible for the rapid spread of diabetes. It is not a single disease: it also damages the nervous system, heart, kidneys, liver, eyes, and various metabolic processes. The clinical industry now holds a huge amount of data for diagnosing diabetic patients. Machine learning algorithms can ease this tedious task by finding hidden patterns, discovering knowledge from databases, and predicting outcomes. This research proposes an efficient machine learning-based diagnosis methodology that outperforms existing similar methodologies. The experiment selects minimum Redundancy Maximum Relevance (mRMR) features from the working dataset and then applies the recursive feature elimination (RFE) technique for optimization. The irregularity problem in the dataset (class imbalance) is addressed with the synthetic minority oversampling technique (SMOTE). Machine learning classification is performed on the selected, optimized features with Decision Tree (C4.5 DT), K-Nearest Neighbors (KNN), Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LGR), and Random Forest (RF) classifiers, of which the RF classifier produces the best results with the lowest false detection rate. The experiment uses 5-fold cross-validation to assess the reliability of the proposed model and obtains an accuracy of 98.10%.
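
The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbors. This is not the authors' code. The function name `smote`, its parameters, and the toy data are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority-class neighbors."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbors than other points
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base point (Euclidean distance)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical toy data: 3 minority rows against 8 majority rows, 2 features each.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
majority_count = 8
new_rows = smote(minority, n_synthetic=majority_count - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(balanced_minority))
```

When a step like this is combined with cross-validation, the oversampling is usually applied to the training folds only, so that synthetic points derived from held-out samples do not leak into the evaluation. The abstract does not state how the two steps were ordered in this study.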

Keywords