Diabetes, Metabolic Syndrome and Obesity (Jun 2021)

Artificial Flora Algorithm-Based Feature Selection with Gradient Boosted Tree Model for Diabetes Classification

  • P N,
  • P D,
  • Mansour RF,
  • Almazroa A

Journal volume & issue
Vol. 14
pp. 2789 – 2806

Abstract


Nagaraj P,1 Deepalakshmi P,1 Romany F Mansour,2 Ahmed Almazroa3

1Department of Computer Science and Engineering, School of Computing, Kalasalingam Academy of Research and Education, Virudhunagar, Tamil Nadu, India; 2Department of Mathematics, Faculty of Science, New Valley University, El-Kharga, Egypt; 3Department of Imaging Research, King Abdullah International Medical Research Center, King Saud bin Abdulaziz University for Health Science, Riyadh, Saudi Arabia

Correspondence: Nagaraj P, Department of Computer Science and Engineering, School of Computing, Kalasalingam Academy of Research and Education, Anand Nagar, Krishnankoil, Srivilliputtur, Virudhunagar, Tamil Nadu, 626126, India
Email: [email protected]

Classification of medical data is essential for determining diabetes treatment options; therefore, the objective of the study was to develop a model that classifies patients into one of three diabetes types according to multiple patient attributes.

Methods: Three different datasets were used to develop a novel medical data classification model. The proposed model involved preprocessing, artificial flora algorithm (AFA)-based feature selection, and gradient boosted tree (GBT)-based classification. Preprocessing occurred in two steps, namely, format conversion and data transformation. AFA was applied to select features, such as demographics, vital signs, laboratory tests, and medications, from the patients’ electronic health records. Lastly, the GBT-based classification model was applied to classify the patients’ cases into type I, type II, or gestational diabetes mellitus.

Results: The effectiveness of the proposed AFA-GBT model was validated using three diabetes datasets to classify patient cases into one of the three different types of diabetes. The proposed model showed a maximum average precision of 91.64%, a recall of 97.46%, an accuracy of 99.93%, an F-score of 94.19%, and a kappa of 96.61%.

Conclusion: The AFA-GBT model could classify patient diagnoses into the three diabetes types efficiently.

Keywords: diabetes, GBT, feature selection, artificial flora, classification
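The evaluation measures named in the Results (precision, recall, accuracy, F-score, and kappa) can all be derived from a multi-class confusion matrix. The following is a minimal sketch in plain Python, not the authors' code: the function name `classification_metrics`, the macro (per-class, unweighted) averaging, and the 3x3 example matrix are assumptions made for illustration only, and the abstract does not state which averaging scheme the study used.

```python
from typing import Dict, List


def classification_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute macro-averaged metrics from a confusion matrix.

    cm[i][j] is the number of cases whose true class is i and whose
    predicted class is j (hypothetical layout chosen for this sketch).
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    true_counts = [sum(cm[i]) for i in range(k)]                        # row sums
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # column sums
    correct = [cm[i][i] for i in range(k)]                              # diagonal

    # Per-class precision and recall, guarding against empty classes.
    precisions = [correct[i] / pred_counts[i] if pred_counts[i] else 0.0 for i in range(k)]
    recalls = [correct[i] / true_counts[i] if true_counts[i] else 0.0 for i in range(k)]
    f_scores = [2 * p * r / (p + r) if (p + r) else 0.0 for p, r in zip(precisions, recalls)]

    accuracy = sum(correct) / total
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(true_counts[i] * pred_counts[i] for i in range(k)) / (total * total)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f_score": sum(f_scores) / k,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Invented counts for three classes (e.g. type I, type II, gestational);
# these are NOT results from the study.
example_cm = [
    [8, 1, 1],
    [1, 8, 1],
    [0, 2, 8],
]
metrics = classification_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Kappa is reported alongside accuracy because it discounts the agreement expected by chance, which matters when the diabetes classes are unevenly represented in a dataset.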

Keywords