Machine Learning with Applications (Dec 2022)

Designing a supervised feature selection technique for mixed attribute data analysis

  • Dong Hyun Jeong,
  • Bong Keun Jeong,
  • Nandi Leslie,
  • Charles Kamhoua,
  • Soo-Yeon Ji

Journal volume & issue
Vol. 10
p. 100431

Abstract

Identifying optimal features is critical for improving data classification performance. This paper introduces a supervised feature selection technique for analyzing mixed attribute data. The technique measures the classification performance of features against a user-defined performance criterion and selects the features that most improve overall data analysis performance. A performance evaluation compares the technique with existing feature selection techniques such as the analysis of variance (ANOVA) test, the chi-square test, principal component analysis, and mutual information. Visualization is also used to show how instances are classified differently with different features. In comparative performance testing, the proposed technique yielded performance improvements of 5 to 10%. Overall, the evaluation results demonstrate the usefulness of the proposed feature selection technique in mixed attribute data analysis.
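
The general idea of scoring features against a user-defined criterion can be illustrated with a minimal sketch. The abstract does not specify the paper's algorithm, so everything below is an assumption made for illustration: a greedy forward search, a leave-one-out 1-nearest-neighbour classifier as the criterion, a Gower-style distance for mixed numeric and categorical values, and a made-up toy dataset. The names `mixed_distance`, `loo_accuracy`, and `forward_select` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def mixed_distance(a: Sequence, b: Sequence, features: Sequence[int]) -> float:
    """Gower-style distance over the chosen feature indices (an assumption).

    Categorical (str) values contribute 0 on a match and 1 on a mismatch;
    numeric values are assumed to be pre-scaled to [0, 1].
    """
    total = 0.0
    for j in features:
        if isinstance(a[j], str):
            total += 0.0 if a[j] == b[j] else 1.0
        else:
            total += abs(a[j] - b[j])
    return total


def loo_accuracy(X: Sequence[Sequence], y: Sequence[int],
                 features: Sequence[int]) -> float:
    """Example criterion: leave-one-out accuracy of a 1-nearest-neighbour rule."""
    correct = 0
    for i in range(len(X)):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: mixed_distance(X[i], X[k], features))
        correct += int(y[nearest] == y[i])
    return correct / len(X)


def forward_select(X: Sequence[Sequence], y: Sequence[int],
                   criterion: Callable[..., float],
                   min_gain: float = 1e-9) -> Tuple[List[int], float]:
    """Greedily add the feature that most improves the criterion.

    Stops when no remaining feature improves the score by more than min_gain.
    """
    selected: List[int] = []
    best = 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(criterion(X, y, selected + [j]), j) for j in remaining]
        score, j = max(scored, key=lambda t: t[0])
        if score - best <= min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


# Made-up toy data: column 0 numeric, column 1 categorical, column 2 numeric.
X = [
    (0.10, "a", 0.9),
    (0.20, "b", 0.1),
    (0.15, "a", 0.5),
    (0.80, "b", 0.8),
    (0.90, "a", 0.2),
    (0.85, "b", 0.6),
]
y = [0, 0, 0, 1, 1, 1]

selected, best = forward_select(X, y, loo_accuracy)
print(selected, best)
```

Because the criterion is passed in as a function, it could be swapped for any other user-defined measure, such as an F1 score or a different classifier, without changing the search loop.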

Keywords