Attribute Selecting in Tree-Augmented Naive Bayes by Cross Validation Risk Minimization

Shenglei Chen; Zhonghui Zhang; Linyuan Liu

doi:10.3390/math9202564

Mathematics (Oct 2021)


Affiliations

Shenglei Chen: Department of E-Commerce, Nanjing Audit University, Nanjing 211815, China
Zhonghui Zhang: School of Finance, Nanjing Audit University, Nanjing 211815, China
Linyuan Liu: Department of E-Commerce, Nanjing Audit University, Nanjing 211815, China

DOI: https://doi.org/10.3390/math9202564
Journal volume & issue: Vol. 9, no. 20, p. 2564

Abstract

As an important improvement on naive Bayes, Tree-Augmented Naive Bayes (TAN) exhibits excellent classification performance and efficiency because it allows every attribute to depend on at most one other attribute in addition to the class variable. However, its performance may degrade when some attributes are redundant. In this paper, we propose an attribute Selective Tree-Augmented Naive Bayes (STAN) algorithm, which builds a sequence of approximate models, each involving only a certain number of top-ranked attributes, and searches for the model that minimizes the cross-validation risk. Five different approaches to ranking the attributes are explored. Because the models can be evaluated simultaneously in a single learning pass through the data, the search is efficient and can avoid local optima in the model space. Extensive experiments on 70 UCI data sets demonstrate that STAN achieves superior performance while maintaining efficiency and simplicity.
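
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' STAN implementation: to stay short it uses plain naive Bayes instead of TAN, ranks attributes by mutual information with the class (the abstract does not name the five ranking approaches), and takes leave-one-out 0-1 loss as the cross-validation risk. The names `mutual_information`, `select_attributes` and `alpha` are hypothetical. What the sketch does preserve is the nested-model trick: because the model with k attributes is a prefix of the model with k + 1, accumulating the log-scores in rank order evaluates every candidate in a single pass over the data.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) of two discrete sequences."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((nxy / n) * math.log(nxy * n / (cx[x] * cy[y]))
               for (x, y), nxy in cxy.items())


def select_attributes(X, y, alpha=1.0):
    """Return (ranked attribute indices, best prefix length, risks).

    risks[k] is the leave-one-out 0-1 loss of a naive Bayes model
    (an assumption; the paper uses TAN) restricted to the k
    top-ranked attributes. k = 0 means the class prior only.
    """
    n, d = len(X), len(X[0])
    classes = sorted(set(y))
    # Rank attributes by mutual information with the class (assumed ranking).
    order = sorted(range(d),
                   key=lambda j: -mutual_information([r[j] for r in X], y))
    n_values = [len({r[j] for r in X}) for j in range(d)]
    class_count = Counter(y)
    joint = [Counter((r[j], c) for r, c in zip(X, y)) for j in range(d)]

    errors = [0] * (d + 1)
    for row, label in zip(X, y):
        # Leave-one-out: remove this instance's own contribution from counts.
        held = {c: 1 if c == label else 0 for c in classes}
        scores = {
            c: math.log((class_count[c] - held[c] + alpha)
                        / (n - 1 + alpha * len(classes)))
            for c in classes
        }
        errors[0] += int(max(classes, key=scores.get) != label)
        # Adding attributes in rank order scores all nested models at once.
        for k, j in enumerate(order, start=1):
            for c in classes:
                n_c = class_count[c] - held[c]
                n_jc = joint[j][(row[j], c)] - held[c]
                scores[c] += math.log((n_jc + alpha)
                                      / (n_c + alpha * n_values[j]))
            errors[k] += int(max(classes, key=scores.get) != label)

    risks = [e / n for e in errors]
    # Ties go to the smallest k, i.e. the simplest model.
    best_k = min(range(d + 1), key=risks.__getitem__)
    return order, best_k, risks
```

Calling `select_attributes(X, y)` on a list of discrete attribute rows `X` and class labels `y` returns the ranking, the number of top-ranked attributes to keep, and the risk of each nested model. Breaking ties toward the smaller k is a design choice of this sketch, made to favor the simpler model.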

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
