Tehran University Medical Journal (Mar 2019)

Designing an intelligent system for diagnosing type 2 diabetes using the data mining approach: brief report

  • Rohollah Kalhor,
  • Asghar Mortezagholi,
  • Fatemeh Naji,
  • Saeed Shahsavari,
  • Mohammad Zakaria Kiaei

Journal volume & issue
Vol. 76, no. 12
pp. 827 – 831

Abstract

Background: Diabetes mellitus has several complications, and late diagnosis allows them to progress. This study therefore examined whether type 2 diabetes can be predicted using data mining techniques. Methods: This descriptive-analytic, cross-sectional study covered people who attended health centers in Mohammadieh City, Qazvin Province, Iran, from April to June 2015 for diabetes screening. The study followed the 5-step CRISP method. Data were collected from March 2015 to June 2015. A total of 1055 persons with complete information were included, of whom 159 were healthy and 896 were diabetic. Eleven characteristics and risk factors were examined: age, sex, systolic and diastolic blood pressure, family history of diabetes, BMI, height, weight, waist circumference, hip circumference and diagnosis. The results of the support vector machine (SVM), decision tree (DT) and k-nearest neighbors (k-NN) algorithms were compared. Data were analyzed using MATLAB® software, version 3.2 (Mathworks Inc., Natick, MA, USA). Results: The decision tree gave the best results on all criteria, with accuracy 0.96 and precision 0.89, followed by k-NN with accuracy 0.96 and precision 0.83 and the support vector machine with accuracy 0.94 and precision 0.85. In the confusion matrix analysis, the decision tree model also achieved the highest class accuracy for both the diabetic and healthy classes. Conclusion: The decision tree gave the best results on the test samples and can be recommended as a model for predicting type 2 diabetes from risk factor data.
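The accuracy and precision figures reported above are both derived from a confusion matrix. As a minimal sketch of how those two metrics relate to it, the Python below counts the four cells for a binary classifier and computes each metric. The labels are a small made-up toy example, not the study's data, and the names `ConfusionMatrix` and `confusion_matrix` are hypothetical helpers introduced here for illustration; the study itself used MATLAB.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    """Cell counts for a binary classifier (hypothetical helper)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # healthy, predicted diabetic
    fn: int  # diabetic, predicted healthy
    tn: int  # healthy, predicted healthy

    def accuracy(self) -> float:
        # Share of all samples that were classified correctly.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        # Share of predicted positives that are truly positive.
        predicted_pos = self.tp + self.fp
        return self.tp / predicted_pos if predicted_pos else 0.0


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> ConfusionMatrix:
    """Count the four cells, treating `positive` as the diabetic class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


# Toy labels for illustration only (1 = diabetic, 0 = healthy).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm)
print(f"accuracy={cm.accuracy():.2f} precision={cm.precision():.2f}")
```

With a class split as uneven as the one in this study (896 diabetic versus 159 healthy), accuracy alone can look high even when one class is poorly classified, which is why a second metric such as precision is reported alongside it.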

Keywords