Mathematics (Oct 2021)

Attribute Selecting in Tree-Augmented Naive Bayes by Cross Validation Risk Minimization

  • Shenglei Chen,
  • Zhonghui Zhang,
  • Linyuan Liu

DOI
https://doi.org/10.3390/math9202564
Journal volume & issue
Vol. 9, no. 20
p. 2564

Abstract

As an important improvement on naive Bayes, Tree-Augmented Naive Bayes (TAN) exhibits excellent classification performance and efficiency because it allows every attribute to depend on at most one other attribute in addition to the class variable. However, its performance can degrade when some attributes are redundant. In this paper, we propose an attribute-Selective Tree-Augmented Naive Bayes (STAN) algorithm, which builds a sequence of approximate models, each involving only a certain number of the top-ranked attributes, and selects the model that minimizes the cross-validation risk. Five different approaches to ranking the attributes are explored. Because the models can be evaluated simultaneously in a single learning pass through the data, the algorithm is efficient and can avoid local optima in the model space. Extensive experiments on 70 UCI data sets demonstrate that STAN achieves superior performance while maintaining efficiency and simplicity.
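The idea of evaluating a nested sequence of models at once can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes several simplifying assumptions: plain naive Bayes stands in for TAN (no attribute-to-attribute edges are learned), attributes are ranked by mutual information with the class (the abstract does not name the five ranking approaches), the ranking is computed on the full data, and the risk is k-fold 0-1 loss. The function names `mutual_information` and `nested_cv_risk` and the toy data are hypothetical. The point it shows is that adding one ranked attribute at a time to a running log-score yields the prediction of every nested model for each held-out instance in a single sweep.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def nested_cv_risk(X, y, n_folds=5, alpha=1.0):
    """Return (ranking, risks); risks[k] is the cross-validated 0-1 loss
    of the model that uses only the top-k ranked attributes."""
    n, d = len(X), len(X[0])
    classes = sorted(set(y))
    n_values = [len({row[j] for row in X}) for j in range(d)]
    # Assumption: rank by mutual information with the class, on all data.
    ranking = sorted(range(d),
                     key=lambda j: -mutual_information([r[j] for r in X], y))
    errors = [0] * (d + 1)
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        class_count = Counter(y[i] for i in train)
        joint = Counter((j, X[i][j], y[i]) for i in train for j in range(d))
        for i in test:
            # k = 0: Laplace-smoothed class prior only.
            score = {c: math.log((class_count[c] + alpha)
                                 / (len(train) + alpha * len(classes)))
                     for c in classes}
            for k in range(d + 1):
                if k > 0:
                    # Extend the running score with the k-th ranked attribute.
                    j = ranking[k - 1]
                    for c in classes:
                        score[c] += math.log(
                            (joint[(j, X[i][j], c)] + alpha)
                            / (class_count[c] + alpha * n_values[j]))
                pred = max(classes, key=lambda c: score[c])
                errors[k] += pred != y[i]
    return ranking, [e / n for e in errors]

# Toy data: attribute 0 copies the class, attribute 1 is independent of it.
y = [i % 2 for i in range(20)]
X = [[y[i], (i // 2) % 2] for i in range(20)]
ranking, risks = nested_cv_risk(X, y)
# Smallest k attaining the minimum cross-validation risk.
best_k = min(range(len(risks)), key=risks.__getitem__)
print(ranking, risks, best_k)
```

On this toy data the informative attribute is ranked first and the selected model keeps only that attribute, since adding the redundant one does not lower the estimated risk.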

Keywords