IEEE Access (Jan 2021)

Individual Attribute Selection Using Information Gain Based Distance for Group Classification of Elderly People With Hypertension

  • Supansa Chaising,
  • Punnarumol Temdee,
  • Ramjee Prasad

DOI
https://doi.org/10.1109/ACCESS.2021.3084623
Journal volume & issue
Vol. 9
pp. 82713 – 82725

Abstract

Attribute selection is the process of choosing the relevant attributes used in model construction in order to improve model accuracy. In general medical classification applications, classical attribute selection methods select a common set of attributes from the dataset for all individuals. This study instead proposes using individual attributes to represent the differences among individuals for self-diagnosis. To that end, it introduces a new attribute selection method, called information gain based distance (IGD), which selects attributes per individual so that each individual’s health condition is represented differently and can be classified effectively. The method combines the concepts of information gain and objective distance to select the individual attributes that affect classification. The IGD method is expected to provide higher classification performance than classical attribute selection methods. To assess it, classification accuracy on data prepared with classical attribute selection methods is compared with accuracy on data prepared with the IGD method. The case study uses 971 secondary data records for group classification of elderly people with hypertension. Results were compared across several classifiers: K-nearest neighbors, neural network, and naive Bayes. Classification of data with the IGD attribute selection method achieved an average accuracy of 98.73%. In comparison, classification of data with the classical attribute selection methods achieved 62.99%, 62.99%, 62.62%, and 62.85% for information gain, Gini index, chi-squared, and decision tree, respectively. These results show that, in this case study, classification with the IGD method outperformed classification with the classical attribute selection methods.
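
For readers unfamiliar with the information gain baseline named in the abstract, the following is a minimal sketch of classical information gain used to rank discrete attributes. It is not the IGD method: the abstract does not specify how objective distance is computed or combined with information gain, so that part is not attempted here. The function names, attribute names, and toy records are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on a discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label subsets after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (not taken from the study's dataset).
labels = ["high", "high", "low", "low"]
attributes = {
    "smoker": ["yes", "yes", "no", "no"],
    "sex": ["f", "m", "f", "m"],
}

# Rank attributes from most to least informative about the class label.
ranking = sorted(
    attributes,
    key=lambda name: information_gain(attributes[name], labels),
    reverse=True,
)
```

In this toy data the hypothetical `smoker` attribute separates the two classes perfectly and `sex` carries no information about them, so `smoker` ranks first. A ranking of this kind yields one common attribute set for the whole dataset, which is the behavior of the classical methods that the study contrasts with per-individual selection.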

Keywords